


(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
3 January 2003 (03.01.2003) 




PCT 



(10) International Publication Number 

WO 03/000905 A2 



(51) International Patent Classification 7 : C12N 15/82 

(21) International Application Number: PCT/IB02/02450 

(22) International Filing Date: 21 June 2002 (21.06.2002) 



(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:
60/300,112    22 June 2001 (22.06.2001)         US
60/325,277    26 September 2001 (26.09.2001)    US
60/342,327    20 December 2001 (20.12.2001)     US



(71) Applicant (for all designated States except US): SYNGENTA
PARTICIPATIONS AG [CH/CH]; Schwarzwaldallee 215,
CH-4058 Basel (CH).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): ZHU, Tong
[US/US]; Torrey Mesa Research Institute, 3115 Merryfield
Row, San Diego, CA 92121-1125 (US). CHENG,
Wenqiong [CN/US]; Torrey Mesa Research Institute,
Inc., 3115 Merryfield Row, San Diego, CA 92121-1125
(US). BRIGGS, Steven [US/US]; Torrey Mesa Research
Institute, Inc., 3115 Merryfield Row, San Diego, CA
92121-1125 (US). COOPER, Bret [US/US]; Torrey Mesa
Research Institute, 3115 Merryfield Row, San Diego, CA
92121 (US). GOFF, Stephen, A. [US/US]; Torrey Mesa
Research Institute, Inc., 3115 Merryfield Row, San Diego,
CA 92121-1125 (US). MOUGHAMER, Todd [US/US];
Torrey Mesa Research Institute, Inc., 3115 Merryfield Row,
San Diego, CA 92121-1125 (US). GLAZEBROOK, Jane
[US/US]; Torrey Mesa Research Institute, 3115 Merryfield
Row, San Diego, CA 92121-1125 (US). KATAGIRI,
Fumiaki [JP/US]; Torrey Mesa Research Institute, 3115
Merryfield Row, San Diego, CA 92121-1125 (US).
KREPS, Joel [US/US]; Torrey Mesa Research Institute,
Inc., 3115 Merryfield Row, San Diego, CA 92121-1125
(US). PROVART, Nicolas [CA/CA]; 33 Longwood
Drive, M3B 1T9 Toronto, Ontario (CA). RICKE, Darrell
[US/US]; Torrey Mesa Research Institute, Inc., 3115
Merryfield Row, San Diego, CA 92121-1115 (US).

(74) Agent: BASTIAN, Werner; Syngenta Participations AG, 
Intellectual Property, P.O.Box, CH-4002 Basel (CH). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, 
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, 
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, 
SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, 
VN, YU, ZA, ZM, ZW. 

(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR,
GB, GR, IE, IT, LU, MC, NL, PT, SE, TR), OAPI patent 
(BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, 
NE, SN, TD, TG). 

Published: 

- without international search report and to be republished
upon receipt of that report

- with sequence listing part of description published separately
in electronic form and available upon request from
the International Bureau

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.

(54) Title: IDENTIFICATION AND CHARACTERIZATION OF PLANT GENES

(57) Abstract: The invention discloses a set of genes the expression products of which are up-regulated during the grain filling
process in rice and active in different metabolic pathways involved in nutrient partitioning. The invention also discloses the use of 
said genes to modify the compositional and nutritional characteristics of the plant grain. 






WO 03/000905 



PCT/IB02/02450 



IDENTIFICATION AND CHARACTERIZATION OF PLANT GENES 

The present invention is in the area of plant biotechnology. In particular, the invention relates to a set
of genes the expression products of which are up-regulated during the grain filling process in rice and
active in different metabolic pathways involved in nutrient partitioning. The invention also relates to
the use of said genes to modify the compositional and nutritional characteristics of the plant grain.

It has long been recognized that the value of agricultural products such as cereal grains and the like
is affected by the quality of their inherent constituent components. In particular, cereal grains with
improved protein, oil, starch, fiber, and moisture content and desirable levels of carbohydrates and
other constituents are of economic interest.

In rice, for example, yield, nutritional characteristics and eating quality are the most important
economic traits. The first two traits are mostly determined by the composition and accumulation of
carbohydrates, proteins, and minerals during grain filling, and the latter by the interaction of various
enzymes to produce the final structure of the starch at the molecular and granule levels. Manipulation
of these pathways results in significant improvement in the nutritional value. For example, reduction of
the amounts of even one enzyme, granule-bound starch synthase, in the starch biosynthetic pathway
can dramatically affect the eating quality, resulting in softer, less sticky cooked rice. Some genes
participating in nutrient partitioning during rice grain filling and affecting starch quality have been
previously identified. However, the genes participating in these processes and their transcriptional
controls are poorly understood.

Within the scope of the present invention a set of genes is now provided which were shown to be
involved in the grain filling process based on their mRNA expression characteristics. The genes
within this subset are preferentially up-regulated and share a similar expression pattern during the
process of grain filling. The expression levels of those genes increase synchronously during grain
development while the encoded gene products are active in different pathways. The genes within this
subgroup, representative examples of which are provided in the Sequence Listing, are thus useful
tools for generating plants which produce grain with modified compositional characteristics leading to
improved nutritional properties.

One of the main objectives of the present invention is thus to provide a polynucleotide
comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated
during grain filling and the use of said molecule for modifying the nutritional composition and quality
of plant grain.

The majority of the genes within this group encode protein products that are directly involved
in or associated with three major pathways of nutrition partitioning: the synthesis and transport of (1)
carbohydrates, (2) proteins, and (3) fatty acids.

The most dramatic increase in relative mRNA expression levels is shown by those genes
whose products control the synthesis of carbohydrates and proteins and can be found in the
endosperm of the developing seed, which is the main sink for plant nutrients.

The other group of genes which shows a significant increase in relative mRNA expression
levels comprises genes that are involved in and in control of fatty acid biosynthesis. These genes have
a more balanced expression between the embryo and endosperm.

In one embodiment the invention thus relates to a subset of isolated nucleic acid molecules
comprising a nucleotide sequence encoding a polypeptide that is involved in at least one of the major
pathways of nutrition partitioning selected from the group consisting of synthesis, transport,
metabolism or degradation of carbohydrates, proteins, and fatty acids.

Another subset of nucleic acid molecules provided herein comprises a number of nucleic acids
that encode different transporters, such as sugar transporters, ABC transporters, amino acid/peptide
transporters, phosphate transporters, and nitrate transporters.

Still another subset of nucleic acid molecules that is provided as part of the invention
comprises nucleic acid molecules that are involved in the transcriptional control of the highly
coordinated grain filling process.

Further subsets of nucleic acid molecules provided herein comprise nucleic acid molecules
the expression products of which are associated with amino acid metabolism; signal transduction;
and stress regulation, respectively.

In a collective embodiment applicable to all of the nucleic acid molecules disclosed herein,
the invention relates to the use of the nucleic acid molecules according to the invention as
hybridization probes, for chromosome and gene mapping, in PCR technologies, in the production of
sense or antisense nucleic acids, in screening for new therapeutic molecules, in production of plants
and seeds having desirable, inheritable, commercially useful phenotypes, or in discovery of inhibitory
compounds.

The invention further relates to any polypeptides encoded by the nucleic acid molecules
according to the invention, or any antigene sequences thereof, which have numerous applications
using techniques that are known to those skilled in the art of molecular biology, biotechnology,
biochemistry, genetics, physiology or pathology.

In a further collective embodiment, the present invention provides the ability to modulate the
grain filling process, by over-expressing, under-expressing or knocking out one or more of the genes
disclosed herein or their gene products, in a plant cell, in vitro or in planta. Expression vectors
comprising at least one nucleic acid molecule according to the invention, or any antigenes thereof,
operably linked to at least one suitable promoter and/or regulatory sequence can be used to study
the role of polypeptides encoded by said sequences, for example by transforming a host cell with
said expression vector and measuring the effects of overexpression and underexpression of said
nucleic acid molecules. Suitable promoter and/or regulatory sequences include especially those that
are preferentially or specifically active in plant grain tissue such as, for example, the grain endosperm
or the grain embryo. A host cell transformed with at least one expression vector comprising at least
one nucleic acid molecule of the invention, operably linked to suitable promoters and/or regulatory
sequences, can be useful to produce a plant grain with improved nutritional or dietary properties.

In a further collective embodiment, the present invention provides a transformed plant host
cell, or one obtained through breeding, capable of over-expressing, under-expressing, or having a
knock out of at least one of the genes according to the invention and/or their gene products.

Such a plant cell, transformed with at least one expression vector comprising a nucleic acid
molecule of the invention, operably linked to suitable promoters and/or regulatory sequences, can be
used to regenerate plant tissue or an entire plant, or seed therefrom, in which the effects of
expression, including overexpression or underexpression, of the introduced sequence or sequences
can be measured in vitro or in planta.

In a further embodiment the present invention provides nucleotide sequences including regions
of nucleotide sequence encoding polypeptides having homology to at least one functional protein
domain (FPD). Embodiments of the invention further provide polypeptides including regions of
amino acid sequence having homology to an FPD. In cases where the polypeptide has homology to
an FPD in the same or closely related species, the polypeptide may represent a paralogous sequence
or paralog, or may represent a variant allele of a gene encoding the FPD. In cases where the
polypeptide has homology to an FPD in another species, including other plant species and especially
non-plant species, polypeptides may represent orthologous sequences, or orthologs, of the FPD.

In a further collective embodiment of the invention the nucleic acid molecules disclosed herein
or representative parts thereof can be used in hybridization-based assays for detecting and
identifying nucleic acid molecules that encode protein products that are involved in the grain filling
process, more particularly in at least one of the major pathways of nutrition partitioning selected from
the group consisting of synthesis, transport, metabolism or degradation of carbohydrates, proteins,
and fatty acids, in plants other than rice, but especially in plants belonging to the cereal group.

Embodiments of the present invention provide a unique oligonucleotide having a sequence
identical to or complementary to a region of a polynucleotide sequence encoding at least a portion of
a homologue of a protein according to the invention representatives of which are identified by SEQ
ID NOs: 2-462, 502-512, and 514-642 provided in the Sequence Listing and/or an FPD thereof,
the oligonucleotide being identified by the methods disclosed herein. In one embodiment, the unique
oligonucleotide has a length of between 12 and 250 nucleotide bases.

Embodiments of the present invention also provide a nucleotide microarray comprising the
unique oligonucleotide having a sequence identical to or complementary to a region of polynucleotide
sequence encoding at least a portion of a homologue of a protein according to the invention
representatives of which are identified by SEQ ID NOs: 2-462, 502-512, and 514-642
provided in the Sequence Listing and/or an FPD thereof. Preferably, the microarray includes a
plurality of different, unique oligonucleotides, the sequences corresponding to a plurality of
homologues of a protein according to the invention representatives of which are identified by the
SEQ ID NOs provided in the Sequence Listing and/or an FPD thereof. Equally preferably, the
microarray contains at least about 96 different unique oligonucleotides, wherein each of the 96
different unique oligonucleotides has a sequence that is identical, complementary, or substantially
similar to a segment of a nucleotide sequence as given in SEQ ID NOs: 1-461, 501-511, and
513-641 provided in the Sequence Listing.

Embodiments of the present invention also provide a kit for detecting the presence of a
polynucleotide, the kit containing a first nucleotide probe which can hybridize with a region of a
nucleotide sequence including the nucleotide sequences of SEQ ID NOs: 1-461 provided in the
Sequence Listing, a fragment or a variant thereof, and a complementary sequence thereto, the kit
further containing at least one additional component such as, for example: a second nucleotide probe,
a buffer, an enzyme, a label, a molecular weight standard, a reaction chamber, and a micropipette
tip.

Embodiments of the present invention further provide a kit for detecting the presence of a
polypeptide, the kit containing a first probe which can hybridize with a region of a polypeptide
including the amino acid sequences of SEQ ID NOs: 2-462, 502-512, and 514-642 provided in
the Sequence Listing, a fragment or a variant thereof, and optionally, the kit further containing at least
one additional component such as, for example: a probe, a buffer, an enzyme, a label, a molecular
weight standard, a reaction chamber, and a micropipette tip. Probes useful in kit embodiments
include antibodies, affinity tags, protein A, protein G, or protein-binding substances including
chromatographic media.

An additional aspect provides a method for selecting plants, for example cereals, having
an altered carbohydrate, protein or fatty acid content and/or composition of the grain comprising
obtaining nucleic acid molecules from the plants to be selected; contacting the nucleic acid molecules
with one or more probes that selectively hybridize under stringent or highly stringent conditions to a
nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-461, 501-511, and
513-641; detecting the hybridization of the one or more probes to the nucleic acid sequences
wherein the presence of the hybridization indicates the presence of a gene associated with altered
carbohydrate, protein or fatty acid content and/or composition of the grain; and selecting plants on
the basis of the presence or absence of such hybridization. In one embodiment, marker-assisted
selection is accomplished in rice. In another embodiment, marker-assisted selection is accomplished
in wheat using one or more probes which selectively hybridize under stringent or highly stringent
conditions to sequences selected from the group consisting of SEQ ID NOs: 951-1105. In yet
another embodiment, marker-assisted selection is accomplished in maize or corn using one or more
probes which selectively hybridize under stringent or highly stringent conditions to sequences selected
from the group consisting of SEQ ID NOs: 1106-1201. In still another embodiment, marker-assisted
selection is accomplished in banana using one or more probes which selectively hybridize
under stringent or highly stringent conditions to sequences selected from the group consisting of SEQ
ID NOs: 884-950. In each case marker-assisted selection can be accomplished using a probe or
probes to a single sequence or multiple sequences. If multiple sequences are used they can be used
simultaneously or sequentially.
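
The crop-to-sequence assignments recited for marker-assisted selection can be summarized as a simple lookup. The following is a minimal Python sketch rather than part of the disclosed method; the names PROBE_TARGETS, probe_targets and is_probe_target are hypothetical, and the rice entry assumes that the rice embodiment uses the general probe set (SEQ ID NOs: 1-461, 501-511, and 513-641) named at the start of this aspect.

```python
# Illustrative lookup of probe target ranges per crop (all names are hypothetical).
# Each tuple is an inclusive range of SEQ ID NOs taken from the paragraph above.
PROBE_TARGETS = {
    "rice": [(1, 461), (501, 511), (513, 641)],  # assumption: the general probe set
    "wheat": [(951, 1105)],
    "maize": [(1106, 1201)],
    "banana": [(884, 950)],
}


def probe_targets(crop):
    """Return the inclusive SEQ ID NO ranges recited for the given crop."""
    return PROBE_TARGETS[crop.lower()]


def is_probe_target(crop, seq_id):
    """Return True if seq_id lies inside one of the ranges recited for the crop."""
    return any(low <= seq_id <= high for low, high in probe_targets(crop))
```

For example, is_probe_target("wheat", 1000) is true, whereas SEQ ID NO: 1106 belongs to the maize range and not to the wheat range.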

In a further embodiment of the invention a computer readable medium containing one or more
of the nucleotide sequences of the invention is provided as well as methods of use for the computer
readable medium. This medium allows a nucleotide sequence corresponding to at least one of the
sequences selected from the group consisting of SEQ ID NOs: 1-461, 501-511, 513-641,
and 884-1201 provided in the Sequence Listing (open reading frames or fragments thereof), to be
used as a reference sequence to search against a database. This medium also allows for computer-based
manipulation of a nucleotide sequence corresponding to at least one of the sequences selected
from the group consisting of SEQ ID NOs: 1-461, 501-511, 513-641, and 884-1201 provided
in the Sequence Listing.

Further aspects, features and advantages of this invention will become apparent from the detailed 
description of the preferred embodiments that follow. 

A further aspect provides a computer readable medium having stored thereon computer
executable instructions for performing a method comprising receiving data on nucleotide sequence
expression in a test plant of at least one nucleic acid molecule having at least 70%, at least 80%, at
least 90% or at least 95% sequence identity to a nucleotide sequence selected from the group
consisting of SEQ ID NOs: 1-461, 501-511, 513-641, and 884-1201; and comparing
expression data from said test plant to expression data for the same nucleotide sequence or
sequences in a plant during grain filling.

Brief Description of the Sequence Listing

In the following, a brief description of the sequences in the Sequence Listing is provided: 

Odd numbered SEQ ID NOs: 1-461 represent a first sub-group (sub-group I) of
polynucleotides comprising nucleotide sequences which encode polypeptides that are up-regulated
during grain filling and are described in Tables 1-11 below.

Even numbered SEQ ID NOs: 2-462 are protein sequences encoded by the immediately
preceding nucleotide sequence, e.g., SEQ ID NO:2 is the protein encoded by the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:4 is the protein encoded by the nucleotide sequence of
SEQ ID NO:3, etc.
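
The odd/even pairing described for sub-group I (SEQ ID NOs: 1-462) can be expressed as a one-line rule. The sketch below is illustrative only; the function names are hypothetical and the range check covers sub-group I alone.

```python
# Illustrative helpers for the odd/even SEQ ID NO pairing of sub-group I.
# Function names are hypothetical; only SEQ ID NOs 1-462 are handled here.

def protein_for_nucleotide(seq_id):
    """Return the protein SEQ ID NO encoded by an odd-numbered nucleotide SEQ ID NO."""
    if not 1 <= seq_id <= 461 or seq_id % 2 == 0:
        raise ValueError("expected an odd SEQ ID NO between 1 and 461")
    return seq_id + 1


def nucleotide_for_protein(seq_id):
    """Return the nucleotide SEQ ID NO that encodes an even-numbered protein SEQ ID NO."""
    if not 2 <= seq_id <= 462 or seq_id % 2 != 0:
        raise ValueError("expected an even SEQ ID NO between 2 and 462")
    return seq_id - 1
```

Thus protein_for_nucleotide(3) gives 4, matching the example in the text.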

Odd numbered SEQ ID NOs: 501-511 represent a second sub-group (sub-group
II) of polynucleotides comprising rice cDNA sequences. The correlation between the sequences in
sub-groups I and II is illustrated in Table 13.

Even numbered SEQ ID NOs: 502-512 are protein sequences encoded by the immediately
preceding nucleotide sequence.

Odd numbered SEQ ID NOs: 513-641 represent a third sub-group (sub-group III)
of polynucleotides comprising nucleotide sequences that have homologies between 80% and 99.9%
to the nucleotide sequences of sub-group I and may be variants or family members of the rice
sequences provided in SEQ ID NOs: 1-461. The correlation between the sequences in sub-groups I
and III is illustrated in Table 12.
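
The 80% to 99.9% homology window stated for sub-group III can be illustrated with a simple percent-identity calculation. This is a sketch under stated assumptions: the text does not say how homology was computed, so the code assumes two already aligned, ungapped sequences of equal length, and the function names are hypothetical.

```python
# Illustrative percent-identity check (assumption: pre-aligned, equal-length,
# ungapped sequences; the source does not specify the alignment method).

def percent_identity(seq_a, seq_b):
    """Return the percentage of positions at which the two sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def in_subgroup_iii_window(identity):
    """Return True if the identity lies in the 80% to 99.9% window given in the text."""
    return 80.0 <= identity <= 99.9
```

A pair differing at one position in ten has 90% identity and falls inside the window; an identical pair (100%) falls outside it.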

Even numbered SEQ ID NOs:514 - 642 are protein sequences encoded by the immediately 
preceding nucleotide sequence. 

SEQ ID NOs: 643-883 are promoter sequences.

SEQ ID NOs: 884-950 are banana sequences which show homology to rice "grain filling"
genes.

SEQ ID NOs: 951-1105 are wheat sequences which show homology to rice "grain filling"
genes.

SEQ ID NOs: 1106-1201 are maize sequences which show homology to rice "grain
filling" genes.

Definitions

For clarity, certain terms used in the specification are defined and presented as follows:

The term "gene" is used broadly to refer to any segment of nucleic acid associated with a
biological function. Thus, genes include coding sequences and/or the regulatory sequences required
for their expression. For example, gene refers to a nucleic acid fragment that expresses mRNA or
functional RNA, or encodes a specific protein, and which includes regulatory sequences. Genes also
include nonexpressed DNA segments that, for example, form recognition sequences for other
proteins. Genes can be obtained from a variety of sources, including cloning from a source of
interest or synthesizing from known or predicted sequence information, and may include sequences
designed to have desired parameters.

The term "native" or "wild type" gene refers to a gene that is present in the genome of an
untransformed cell, i.e., a cell not having a known mutation.

A "marker gene" encodes a selectable or screenable trait.

The term "chimeric gene" refers to any gene that contains 1) DNA sequences, including
regulatory and coding sequences, that are not found together in nature, or 2) sequences encoding
parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined.
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are
derived from different sources, or comprise regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different from that found in nature.

A "transgene" refers to a gene that has been introduced into the genome by transformation
and is stably maintained. Transgenes may include, for example, genes that are either heterologous or
homologous to the genes of a particular plant to be transformed. Additionally, transgenes may
comprise native genes inserted into a non-native organism, or chimeric genes. The term "endogenous
gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene
refers to a gene not normally found in the host organism but that is introduced by gene transfer.

An "oligonucleotide" corresponding to a nucleotide sequence of the invention, e.g., for use in
probing or amplification reactions, may be about 30 or fewer nucleotides in length (e.g., 9, 12, 15,
18, 20, 21 or 24, or any number between 9 and 30). Generally, specific primers are upwards of 14
nucleotides in length. For optimum specificity and cost effectiveness, primers of 16 to 24 nucleotides
in length may be preferred. Those skilled in the art are well versed in the design of primers for use
in processes such as PCR. If required, probing can be done with entire restriction fragments of the
gene disclosed herein, which may be hundreds or even thousands of nucleotides in length.
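
The length guidance for oligonucleotides and primers can be captured in a small helper. The following Python sketch is illustrative; the function name and the returned keys are hypothetical, "about 30 or fewer" is treated as a hard limit of 30, and "upwards of 14" is read as 14 or more, both of which are assumptions.

```python
# Illustrative length check for oligonucleotides (names and cut-offs are
# assumptions layered on the guidance in the paragraph above).

def describe_oligo(sequence):
    """Summarize how a sequence's length compares with the stated guidance."""
    length = len(sequence)
    return {
        "length": length,
        "oligonucleotide": length <= 30,         # "about 30 or fewer", treated as <= 30
        "specific_primer": length >= 14,         # "upwards of 14", read as >= 14
        "preferred_primer": 16 <= length <= 24,  # 16 to 24 nucleotides preferred
    }
```

A 20-mer is reported as a preferred primer, a 12-mer as an oligonucleotide below the usual specific-primer length, and a 100-nucleotide fragment as longer than an oligonucleotide in the sense used here.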

The terms "protein," "peptide" and "polypeptide" are used interchangeably herein.

The nucleotide sequences of the invention can be introduced into any plant. The genes to be
introduced can be conveniently used in expression cassettes for introduction and expression in any
plant of interest. Such expression cassettes will comprise the transcriptional initiation region of the
invention linked to a nucleotide sequence of interest. Preferred promoters include constitutive,
tissue-specific, developmental-specific, inducible and/or viral promoters. Such an expression
cassette is provided with a plurality of restriction sites for insertion of the gene of interest to be under
the transcriptional regulation of the regulatory regions. The expression cassette may additionally
contain selectable marker genes. The cassette will include in the 5'-3' direction of transcription, a
transcriptional and translational initiation region, a DNA sequence of interest, and a transcriptional
and translational termination region functional in plants. The termination region may be native with
the transcriptional initiation region, may be native with the DNA sequence of interest, or may be
derived from another source. Convenient termination regions are available from the Ti-plasmid of A.
tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also,
Guerineau et al., 1991; Proudfoot, 1991; Sanfacon et al., 1991; Mogen et al., 1990; Munroe et al.,
1990; Ballas et al., 1989; Joshi et al., 1987.

"Coding sequence" refers to a DNA or RNA sequence that codes for a specific amino acid
sequence and excludes the non-coding sequences. It may constitute an "uninterrupted coding
sequence", i.e., lacking an intron, such as in a cDNA or it may include one or more introns bounded
by appropriate splice junctions. An "intron" is a sequence of RNA which is contained in the primary
transcript but which is removed through cleavage and re-ligation of the RNA within the cell to create
the mature mRNA that can be translated into a protein.

The terms "open reading frame" and "ORF" refer to the amino acid sequence encoded
between translation initiation and termination codons of a coding sequence. The terms "initiation
codon" and "termination codon" refer to a unit of three adjacent nucleotides ('codon') in a coding
sequence that specifies initiation and chain termination, respectively, of protein synthesis (mRNA
translation).
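
The definition of an open reading frame can be made concrete with a short translation routine. The sketch below is illustrative only: the function name first_orf is hypothetical, and it assumes ATG as the initiation codon, the standard genetic code, and a search of the given strand only.

```python
# Illustrative ORF finder (assumptions: ATG initiation codon, standard genetic
# code, forward strand only; names are hypothetical).
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map each codon (TTT, TTC, TTA, ...) to its one-letter amino acid; '*' marks a stop.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def first_orf(dna):
    """Return the amino acid sequence of the first ATG-to-stop reading frame, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        protein = []
        for i in range(start, len(dna) - 2, 3):
            amino_acid = CODON_TABLE.get(dna[i:i + 3])
            if amino_acid is None:   # unrecognized base: abandon this start codon
                break
            if amino_acid == "*":    # termination codon closes the reading frame
                return "".join(protein)
            protein.append(amino_acid)
        start = dna.find("ATG", start + 1)
    return None
```

For the input CCATGGCTTAAGG, the frame ATG GCT TAA is found and the encoded sequence "MA" is returned; a sequence with no in-frame termination codon yields None.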

A "functional RNA" refers to an antisense RNA, ribozyme, or other RNA that is not 
translated. 

The term "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed
transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the
DNA sequence, it is referred to as the primary transcript; alternatively, it may be an RNA sequence
derived from posttranscriptional processing of the primary transcript, in which case it is referred to
as the mature RNA. "Messenger RNA" (mRNA) refers to the RNA that is without introns and that can
be translated into protein by the cell. "cDNA" refers to a single- or a double-stranded DNA that is
complementary to and derived from mRNA.

"Regulatory sequences" and "suitable regulatory sequences" each refer to nucleotide
sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding
sequences) of a coding sequence, and which influence the transcription, RNA processing or stability,
or translation of the associated coding sequence. Regulatory sequences include enhancers,
promoters, translation leader sequences, introns, and polyadenylation signal sequences. They include
natural and synthetic sequences as well as sequences which may be a combination of synthetic and
natural sequences. As is noted above, the term "suitable regulatory sequences" is not limited to
promoters.

"5' non-coding sequence" refers to a nucleotide sequence located 5' (upstream) to the coding
sequence. It is present in the fully processed mRNA upstream of the initiation codon and may affect
processing of the primary transcript to mRNA, mRNA stability or translation efficiency (Turner et al.,
1995).

"3' non-coding sequence" refers to nucleotide sequences located 3' (downstream) to a
coding sequence and includes polyadenylation signal sequences and other sequences encoding
regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation
signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the
mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al.,
1989.






The term "translation leader sequence" refers to that DNA sequence portion of a gene
between the promoter and coding sequence that is transcribed into RNA and is present in the fully
processed mRNA upstream (5') of the translation start codon. The translation leader sequence may
affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency.

"Signal peptide" refers to the amino terminal extension of a polypeptide, which is translated
in conjunction with the polypeptide forming a precursor peptide and which is required for its entrance
into the secretory pathway. The term "signal sequence" refers to a nucleotide sequence that encodes
the signal peptide.

"Promoter" refers to a nucleotide sequence, usually upstream (5') to its coding sequence,
which controls the expression of the coding sequence by providing the recognition for RNA
polymerase and other factors required for proper transcription. "Promoter" includes a minimal
promoter that is a short DNA sequence comprised of a TATA box and other sequences that serve
to specify the site of transcription initiation, to which regulatory elements are added for control of
expression. "Promoter" also refers to a nucleotide sequence that includes a minimal promoter plus
regulatory elements that is capable of controlling the expression of a coding sequence or functional
RNA. This type of promoter sequence consists of proximal and more distal upstream elements, the
latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which
can stimulate promoter activity and may be an innate element of the promoter or a heterologous
element inserted to enhance the level or tissue specificity of a promoter. It is capable of operating in
both orientations (normal or flipped), and is capable of functioning even when moved either upstream
or downstream from the promoter. Both enhancers and other upstream promoter elements bind
sequence-specific DNA-binding proteins that mediate their effects. Promoters may be derived in
their entirety from a native gene, or be composed of different elements derived from different
promoters found in nature, or even be comprised of synthetic DNA segments. A promoter may also
contain DNA sequences that are involved in the binding of protein factors which control the
effectiveness of transcription initiation in response to physiological or developmental conditions.

The "initiation site" is the position surrounding the first nucleotide that is part of the 
transcribed sequence, which is also defined as position +1. With respect to this site all other 
sequences of the gene and its controlling regions are numbered. Downstream sequences (i.e., further
protein encoding sequences in the 3' direction) are denominated positive, while upstream sequences
(mostly of the controlling regions in the 5' direction) are denominated negative.
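
The numbering scheme described above can be illustrated with a minimal Python sketch. The function name `relative_position`, its parameters, and the choice to skip position 0 (a common convention that the text itself does not state) are assumptions made for illustration only.

```python
def relative_position(coord: int, initiation_site: int, strand: str = "+") -> int:
    """Number a coordinate relative to the initiation site, which is position +1.

    Assumption (not stated in the text): there is no position 0, so the
    nucleotide immediately upstream of +1 is numbered -1.
    """
    # Signed distance from the initiation site, measured in the 5' -> 3' direction.
    offset = coord - initiation_site if strand == "+" else initiation_site - coord
    # At or downstream of the start site: +1, +2, ...; upstream: -1, -2, ...
    return offset + 1 if offset >= 0 else offset
```

For example, with the initiation site at coordinate 100 on the forward strand, coordinate 100 is numbered +1 and coordinate 70 is numbered -30.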

Promoter elements, particularly a TATA element, that are inactive or that have greatly 
reduced promoter activity in the absence of upstream activation are referred to as "minimal or core
promoters." In the presence of a suitable transcription factor, the minimal promoter functions to
permit transcription. A "minimal or core promoter" thus consists only of all basal elements needed 
for transcription initiation, e.g., a TATA box and/or an initiator. 

"Constitutive expression" refers to expression using a constitutive or regulated promoter. 
"Conditional" and "regulated expression" refer to expression controlled by a regulated promoter. 

"Constitutive promoter" refers to a promoter that is able to express the open reading frame
(ORF) that it controls in all or nearly all of the plant tissues during all or nearly all developmental
stages of the plant. Each of the transcription-activating elements does not exhibit an absolute
tissue-specificity, but mediates transcriptional activation in most plant parts at a level of >1% of the level
reached in the part of the plant in which transcription is most active. 
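
As a rough illustration of the >1% criterion in this definition, the following Python sketch classifies a set of per-tissue expression measurements. The function name `is_constitutive`, the input format, and the reading of "most plant parts" as "more than half of the measured parts" are assumptions for illustration, not part of the definition.

```python
from typing import Mapping


def is_constitutive(levels: Mapping[str, float],
                    threshold: float = 0.01,
                    min_fraction: float = 0.5) -> bool:
    """Check whether expression exceeds 1% of the peak level in most plant parts.

    levels maps a plant part name to a measured expression level.
    min_fraction is a hypothetical cutoff standing in for "most plant parts".
    """
    if not levels:
        return False
    peak = max(levels.values())
    if peak <= 0:
        return False
    # Count the parts whose level is above threshold * peak (the peak part counts too).
    active = sum(1 for value in levels.values() if value > threshold * peak)
    return active / len(levels) > min_fraction
```

With measurements such as `{"leaf": 100, "root": 40, "seed": 5, "pollen": 0.5}`, three of four parts exceed 1% of the peak, so the sketch reports the promoter as constitutive under these assumptions.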

"Regulated promoter" refers to promoters that direct gene expression not constitutively, but
in a temporally- and/or spatially-regulated manner, and includes both tissue-specific and inducible
promoters. It includes natural and synthetic sequences as well as sequences which may be a 
combination of synthetic and natural sequences. Different promoters may direct the expression of a 
gene in different tissues or cell types, or at different stages of development, or in response to different
environmental conditions. New promoters of various types useful in plant cells are constantly being
discovered; numerous examples may be found in the compilation by Okamuro et al. (1989). Typical
regulated promoters useful in plants include but are not limited to safener-inducible promoters,
promoters derived from the tetracycline-inducible system, promoters derived from
salicylate-inducible systems, promoters derived from alcohol-inducible systems, promoters derived from
glucocorticoid-inducible systems, promoters derived from pathogen-inducible systems, and promoters
derived from ecdysone-inducible systems.

"Tissue-specific promoter" refers to regulated promoters that are not expressed in all plant
cells but only in one or more cell types in specific organs (such as leaves or seeds), specific tissues
(such as embryo or cotyledon), or specific cell types (such as leaf parenchyma or seed storage cells).
These also include promoters that are temporally regulated, such as in early or late embryogenesis,
during fruit ripening in developing seeds or fruit, in fully differentiated leaf, or at the onset of
senescence.

"Inducible promoter" refers to those regulated promoters that can be turned on in one or 
more cell types by an external stimulus, such as a chemical, light, hormone, stress, or a pathogen.

"Operably-linked" refers to the association of nucleic acid sequences on a single nucleic acid
fragment so that the function of one is affected by the other. For example, a regulatory DNA
sequence is said to be "operably linked to" or "associated with" a DNA sequence that codes for an
RNA or a polypeptide if the two sequences are situated such that the regulatory DNA sequence
affects expression of the coding DNA sequence (i.e., that the coding sequence or functional RNA is
under the transcriptional control of the promoter). Coding sequences can be operably-linked to 
regulatory sequences in sense or antisense orientation. 

"Expression" refers to the transcription and/or translation of an endogenous gene, ORF or 
portion thereof, or a transgene in plants. For example, in the case of antisense constructs, expression 
may refer to the transcription of the antisense DNA only. In addition, expression refers to the
transcription and stable accumulation of sense (mRNA) or functional RNA. Expression may also 
refer to the production of protein. 

"Specific expression" is the expression of gene products which is limited to one or a few 
plant tissues (spatial limitation) and/or to one or a few plant developmental stages (temporal 
limitation). It is acknowledged that true specificity hardly exists: promoters seem to be preferentially
switched on in some tissues, while in other tissues there can be no or only little activity. This
phenomenon is known as leaky expression. However, specific expression in this invention
means preferential expression in one or a few plant tissues.

The "expression pattern" of a promoter (with or without enhancer) is the pattern of 
expression levels which shows where in the plant and in what developmental stage transcription is
initiated by said promoter. Expression patterns of a set of promoters are said to be complementary 
when the expression pattern of one promoter shows little overlap with the expression pattern of the 
other promoter. The level of expression of a promoter can be determined by measuring the 'steady
state' concentration of a standard transcribed reporter mRNA. This measurement is indirect since
the concentration of the reporter mRNA is dependent not only on its synthesis rate, but also on the
rate with which the mRNA is degraded. Therefore, the steady state level reflects both the synthesis
rate and the degradation rate.

The rate of degradation can however be considered to proceed at a fixed rate when the 
transcribed sequences are identical, and thus this value can serve as a measure of synthesis rates. 
When promoters are compared in this way, techniques available to those skilled in the art include
hybridization S1-RNAse analysis, northern blots and competitive RT-PCR. This list of techniques in
no way represents all available techniques, but rather describes commonly used procedures
to analyze transcription activity and expression levels of mRNA.
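
The reasoning that a fixed degradation rate lets steady-state mRNA level stand in for synthesis rate can be sketched numerically. The sketch below assumes a simple first-order decay model, dM/dt = k_syn - k_deg * M, which is an illustrative assumption and not a model given in the text; the function names and parameters are likewise hypothetical.

```python
def steady_state_level(synthesis_rate: float, degradation_constant: float) -> float:
    """Steady-state mRNA level under the assumed model dM/dt = k_syn - k_deg * M.

    Setting dM/dt = 0 gives M = k_syn / k_deg.
    """
    if degradation_constant <= 0:
        raise ValueError("degradation_constant must be positive")
    return synthesis_rate / degradation_constant


def relative_synthesis_rate(level_a: float, level_b: float) -> float:
    """Ratio of synthesis rates inferred from two steady-state levels.

    Only meaningful when both reporter transcripts share the same degradation
    constant, i.e. when the transcribed sequences are identical.
    """
    return level_a / level_b
```

Under this model, two promoters driving an identical reporter transcript with synthesis rates of 10 and 5 (arbitrary units) and a shared degradation constant of 0.5 give steady-state levels of 20 and 10, so the ratio of levels (2) equals the ratio of synthesis rates.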

The analysis of transcription start points in practically all promoters has revealed that there is 
usually no single base at which transcription starts, but rather a more or less clustered set of initiation 
sites, each of which accounts for some start points of the mRNA. Since this distribution varies from 
promoter to promoter the sequences of the reporter mRNA in each of the populations would differ 
from each other. Since each mRNA species is more or less prone to degradation, no single 
degradation rate can be expected for different reporter mRNAs. It has been shown for various 
eukaryotic promoter sequences that the sequence surrounding the initiation site ('initiator') plays an 
important role in determining the level of RNA expression directed by that specific promoter. This 
also includes part of the transcribed sequences. The direct fusion of promoter to reporter sequences
would therefore lead to suboptimal levels of transcription. 

A commonly used procedure to analyze expression patterns and levels is through 
determination of the 'steady state' level of protein accumulation in a cell. Commonly used candidates
for the reporter gene, known to those skilled in the art, are β-glucuronidase (GUS), chloramphenicol
acetyl transferase (CAT) and proteins with fluorescent properties, such as green fluorescent protein
(GFP) from Aequorea victoria. In principle, however, many more proteins are suitable for this
purpose, provided the protein does not interfere with essential plant functions. For quantification and 
determination of localization a number of tools are suited. Detection systems can readily be created 
or are available which are based on, e.g., immunochemical, enzymatic, fluorescent detection and 
quantification. Protein levels can be determined in plant tissue extracts or in intact tissue using in situ 
analysis of protein expression.

Generally, individual transformed lines with one chimeric promoter reporter construct will
vary in their levels of expression of the reporter gene. Also frequently observed is the phenomenon 
that such transformants do not express any detectable product (RNA or protein). The variability in 
expression is commonly ascribed to 'position effects', although the molecular mechanisms underlying 
this inactivity are usually not clear.

"Overexpression" refers to the level of expression in transgenic cells or organisms that 
exceeds levels of expression in normal or untransformed (nontransgenic) cells or organisms. 

"Antisense inhibition" refers to the production of antisense RNA transcripts capable of 
suppressing the expression of protein from an endogenous gene or a transgene.

"Gene silencing" refers to homology-dependent suppression of viral genes, transgenes, or
endogenous nuclear genes. Gene silencing may be transcriptional, when the suppression is due to
decreased transcription of the affected genes, or post-transcriptional, when the suppression is due to 
increased turnover (degradation) of RNA species homologous to the affected genes (English et al.,
1996). Gene silencing includes virus-induced gene silencing (Ruiz et al., 1998).

The terms "heterologous DNA sequence," "exogenous DNA segment" or "heterologous
nucleic acid," as used herein, each refer to a sequence that originates from a source foreign to the
particular host cell or, if from the same source, is modified from its original form. Thus, a 
heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has 
been modified through, for example, the use of DNA shuffling. The terms also include non-naturally
occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA
segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within 
the host cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are 
expressed to yield exogenous polypeptides. A "homologous" DNA sequence is a DNA sequence 
that is naturally associated with a host cell into which it is introduced.

"Homologous to" in the context of nucleotide sequence identity refers to the similarity
between the nucleotide sequence of two nucleic acid molecules or between the amino acid
sequences of two protein molecules. As used herein, "homology" and "homologous" refer to an 
evaluation of the similarity between two sequences based on measurements of sequence identity 
adjusted for variables including gaps, insertions, frame shifts, conservative substitutions, and
sequencing errors, as described below. Two nucleotide sequences or polypeptides are said to be
"identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is 
the same when aligned for maximum correspondence as described below. The term 
"complementary to" is used herein to mean that the sequence can form a Watson-Crick base pair 
with a reference polynucleotide sequence. Complementary sequences can include nucleotides, such 
as inosine, that neither disrupt Watson-Crick base pairing nor contribute to the pairing. A "reverse 
complement" of a sequence corresponds to the complementary sequence, but in the opposite 
orientation of bases from 5' to 3', or to the complement of the primary sequence, if the primary 
sequence is in a reverse orientation of bases from 5' to 3'.
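
The relationship between a sequence, its complement, and its reverse complement can be shown with a short Python sketch. It handles only the four standard DNA bases; analogs such as inosine mentioned above are outside its scope, and the function names are illustrative.

```python
# Translation table for Watson-Crick pairing of the four standard DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement written in the opposite orientation (5' to 3')."""
    return complement(seq)[::-1]
```

For the sequence 5'-ATGC-3', the complement read in the same order is TACG, and the reverse complement written 5' to 3' is GCAT.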

Homology is evaluated using any of the variety of sequence comparison algorithms and 
programs known in the art. Such algorithms and programs include, but are by no means limited to, 
TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, Proc Natl 
Acad Sci (USA) 85:2444 (1988); Altschul et al., J Mol Biol 215:403 (1990)). In a particularly
preferred embodiment, protein and nucleic acid sequence homologies are evaluated using the Basic
Local Alignment Search Tool ("BLAST") which is well known in the art (Karlin and Altschul, Proc
Natl Acad Sci USA 87:2264 (1990); Altschul et al. (1990) supra; Altschul et al., Nucleic Acids
Res 25:3389 (1997)). In particular, five specific BLAST programs are used to perform the
following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein
sequence database; 

(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence 
database; 

(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide
sequence (both strands) against a protein sequence database; 

(4) TBLASTN compares a query protein sequence against a nucleotide sequence database 
translated in all six reading frames (both strands); and 

(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against
the six-frame translations of a nucleotide sequence database.

The BLAST programs identify homologous sequences by identifying similar segments, which are
referred to herein as "high-scoring segment pairs," between a query amino acid or nucleic acid sequence
and a test sequence which is preferably obtained from a protein or nucleic acid sequence database.
High-scoring segment pairs are preferably identified (aligned) by means of a scoring matrix selected
from the many scoring matrices known in the art. Preferably, the scoring matrix used is the
BLOSUM62 matrix (Gonnet et al., Science 256:1443 (1992); Henikoff and Henikoff, Proteins
17:49 (1993)). Likewise, the PAM or PAM250 matrices may also be used (Schwartz and Dayhoff,
In Atlas of Protein Sequence and Structure, Dayhoff, ed., Natl. Biomed. Res. Found., pp. 353-358
(1978)). The BLAST programs evaluate the statistical significance of all high-scoring segment
pairs identified, and preferably select those segments which satisfy a user-specified threshold of
significance, such as a user-specified percent homology. Preferably, the statistical significance of a
high-scoring segment pair is evaluated using the statistical significance formula of Karlin (Karlin and
Altschul (1990) supra).
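
The division of labor among the BLAST programs listed above can be summarized as a lookup keyed on the type of the query and of the database. The Python sketch below is only a restatement of that list; the function name `choose_blast_program` and the `translate_both` flag are hypothetical, and BLAST3 is omitted because the list gives it the same role as BLASTP.

```python
# (query type, database type) -> program, as described in the list above.
BLAST_PROGRAMS = {
    ("protein", "protein"): "BLASTP",
    ("nucleotide", "nucleotide"): "BLASTN",
    ("nucleotide", "protein"): "BLASTX",
    ("protein", "nucleotide"): "TBLASTN",
}


def choose_blast_program(query_type: str, database_type: str,
                         translate_both: bool = False) -> str:
    """Pick the program matching the query and database sequence types.

    translate_both selects TBLASTX, which compares six-frame translations of a
    nucleotide query against six-frame translations of a nucleotide database.
    """
    if translate_both:
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("TBLASTX requires a nucleotide query and database")
        return "TBLASTX"
    return BLAST_PROGRAMS[(query_type, database_type)]
```

For instance, a protein query searched against a nucleotide database maps to TBLASTN, while a nucleotide query searched against a protein database maps to BLASTX.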

"Percentage of sequence identity" can be determined from alignments performed using 

algorithms known in the art. Alignment of nucleotide or polypeptide sequences for comparison may
be conducted by the local homology algorithm of Smith and Waterman (Adv Appl Math 2:482
(1981)), by the homology alignment algorithm of Needleman and Wunsch (J Mol Biol 48:443
(1970)), by the search for similarity method of Pearson and Lipman (Proc Natl Acad Sci USA
85:2444 (1988)), by computerized implementations of these algorithms (GAP, BESTFIT, BLAST,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group),
or by inspection. When two sequences have been identified for comparison, GAP and BESTFIT
are preferably employed to determine their optimal alignment. Typically, the default values of 5.00
for gap weight and 0.30 for gap weight length are used. In a preferred embodiment, percent
identity is determined using the GAP program for global alignment using default parameters, using the
version of GAP found in the GCG package (Wisconsin Package Version 10.1, Genetics Computer
Group, 575 Science Dr., Madison, Wisconsin).

"Percentage of sequence identity" is determined by comparing two optimally aligned 
sequences over a comparison window, wherein the portion of the sequence in the comparison 
window may include additions or deletions, including for example gaps or overhangs, as compared
to the reference sequence (which does not include additions or deletions) for optimal alignment of the
two sequences. The percentage is calculated by determining the number of positions at which the
identical nucleotide base or amino acid residue occurs in both sequences to yield the number of
matched positions, dividing the number of matched positions by the total number of positions in the
window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
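
The calculation just described can be written out as a minimal Python sketch. It assumes the two sequences have already been aligned (for example by GAP) and are supplied as equal-length strings in which `-` marks a gap; the function name and the gap symbol are illustrative assumptions, and the alignment step itself is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where both sequences carry the identical
    residue; gap positions count toward the window length but never match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)
```

For the aligned pair `ACGT-A` and `ACCTTA`, four of the six window positions match, giving roughly 66.7% identity.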
In a broad sense, the term "substantially similar", when used herein with respect to a 
nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide 
sequence, wherein the corresponding sequence encodes a polypeptide having substantially the same 
structure as the polypeptide encoded by the reference nucleotide sequence. Desirably, the
substantially similar nucleotide sequence encodes the polypeptide encoded by the reference
nucleotide sequence. Preferably, "substantially similar" refers to nucleotide sequences having at least
50% sequence identity, preferably at least 60%, 70%, 80% or 85%, more preferably at least 90% 
or 95%, and even more preferably, at least 96%, 97% or 99% sequence identity compared to a 
reference sequence containing nucleotide sequences of Table I, that encode a protein having at least
50% identity, more preferably at least 85% identity, yet still more preferably at least 90% identity to
a region of sequence of a BIOPATH protein and/or an FPD, wherein the protein sequence 
comparisons are conducted using GAP analysis as described below. Also, "substantially similar" 
preferably also refers to nucleotide sequences having at least 50% identity, more preferably at least 
80% identity, still more preferably 95% identity, yet still more preferably at least 99% identity, to a
region of nucleotide sequence encoding a BIOPATH protein and/or an FPD, wherein the nucleotide
sequence comparisons are conducted using GAP analysis as described below. The term 
"substantially similar" is specifically intended to include nucleotide sequences wherein the sequence 
has been modified to optimize expression in particular cells. 

A polynucleotide including a nucleotide sequence "substantially similar" to the reference
nucleotide sequence preferably hybridizes to a polynucleotide including the reference nucleotide
sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing
in 2X SSC, 0.1% SDS at 50°C, more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C, more desirably still in
7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.5X
SSC, 0.1% SDS at 50°C, preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM
EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 50°C, more preferably in 7% sodium
dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1%
SDS at 65°C.

The term "substantially similar", when used herein with respect to a protein or polypeptide,
means a protein or polypeptide corresponding to a reference protein, wherein the protein has
substantially the same structure and function as the reference protein, where only changes in amino 
acid sequence that do not materially affect the polypeptide function occur. When used for a protein
or an amino acid sequence, the percentage of identity between the substantially similar and the
reference protein or amino acid sequence is desirably at least 30%, more preferably at
least 40%, 50%, 60%, 70%, 80%, 85%, or 90%, still more preferably at least 95%, still more
preferably at least 99%, with every individual number falling within this range of at least 30% to at
least 99% also being part of the invention, using default GAP analysis parameters with the University
of Wisconsin GCG (version 10), SEQWEB application of GAP, based on the algorithm of
Needleman and Wunsch (1970), supra. As used herein the term "polypeptide of the present
invention," or any similar term refers to an amino acid sequence encoded by a DNA molecule 
including a nucleotide sequence substantially similar to an AC sequence. Homologs of BIOPATH 
protein and/or FPDs include amino acid sequences that are at least 30% identical to BIOPATH 
protein and/or FPD sequences found in searchable databases, as measured using the parameters
described above.

"Target gene" refers to a gene on the replicon that expresses the desired target coding 
sequence, functional RNA, or protein. The target gene is not essential for replicon replication. 
Additionally, target genes may comprise native non-viral genes inserted into a non-native organism,
or chimeric genes, and will be under the control of suitable regulatory sequences. Thus, the
regulatory sequences in the target gene may come from any source, including the virus. Target genes
may include coding sequences that are either heterologous or homologous to the genes of a particular 
plant to be transformed. However, target genes do not include native viral genes. Typical target 
genes include, but are not limited to genes encoding a structural protein, a seed storage protein, a 
protein that conveys herbicide resistance, and a protein that conveys insect resistance. Proteins
encoded by target genes are known as "foreign proteins". The expression of a target gene in a plant
will typically produce an altered plant trait. 

The term "altered plant trait" means any phenotypic or genotypic change in a transgenic plant 
relative to the wild-type or non-transgenic plant host.

"Chromosomally-integrated" refers to the integration of a foreign gene or DNA construct into
the host DNA by covalent bonds. Where genes are not "chromosomally integrated" they may be
"transiently expressed." Transient expression of a gene refers to the expression of a gene that is not
integrated into the host chromosome but functions independently, either as part of an autonomously
replicating plasmid or expression cassette, for example, or as part of another biological system such
as a virus.

The term "transformation" refers to the transfer of a nucleic acid fragment into the genome of 
a host cell, resulting in genetically stable inheritance. Host cells containing the transformed nucleic 
acid fragments are referred to as "transgenic" cells, and organisms comprising transgenic cells are 
referred to as "transgenic organisms". Examples of methods of transformation of plants and plant 
cells include Agrobacterium-mediated transformation (De Blaere et al., 1987) and particle
bombardment technology (Klein et al., 1987; U.S. Patent No. 4,945,050). Whole plants may be
regenerated from transgenic cells by methods well known to the skilled artisan (see, for example,
Fromm et al., 1990).

"Transformed," "transgenic," and "recombinant" refer to a host organism such as a bacterium
or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid
molecule can be stably integrated into the genome by methods generally known in the art and disclosed in
Sambrook et al., 1989. See also Innis et al., 1995 and Gelfand, 1995; and Innis and Gelfand, 1999.
Known methods of PCR include, but are not limited to, methods using paired primers, nested
primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers,
partially mismatched primers, and the like. For example, "transformed," "transformant," and
"transgenic" plants or calli have been through the transformation process and contain a foreign gene
integrated into their chromosome. The term "untransformed" refers to normal plants that have not
been through the transformation process.

"Transiently transformed" refers to cells in which transgenes and foreign DNA have been
introduced (for example, by such methods as Agrobacterium-mediated transformation or biolistic
bombardment), but not selected for stable maintenance. 

"Stably transformed" refers to cells that have been selected and regenerated on a selection 
medium following transformation.

"Transient expression" refers to expression in cells in which a virus or a transgene is 
introduced by viral infection or by such methods as Agrobacterium-mediated transformation, 
electroporation, or biolistic bombardment, but not selected for its stable maintenance. 

"Genetically stable" and "heritable" refer to chromosomally-integrated genetic elements that 
are stably maintained in the plant and stably inherited by progeny through successive generations.

"Primary transformant" and "T0 generation" refer to transgenic plants that are of the same
genetic generation as the tissue which was initially transformed (i.e., not having gone through meiosis 
and fertilization since transformation).

"Secondary transformants" and the "T1, T2, T3, etc. generations" refer to transgenic plants
derived from primary transformants through one or more meiotic and fertilization cycles. They may
be derived by self-fertilization of primary or secondary transformants or crosses of primary or 
secondary transformants with other transformed or untransformed plants.

"Wild-type" refers to a virus or organism found in nature without any known mutation.

"Genome" refers to the complete genetic material of an organism.

The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in
either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar,
phosphate and a base which is either a purine or pyrimidine. Unless specifically limited, the term
encompasses nucleic acids containing known analogs of natural nucleotides which have similar 
binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally 
occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly
encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and 
complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate 
codon substitutions may be achieved by generating sequences in which the third position of one or 
more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et
al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994). A "nucleic acid fragment" is a fraction of a
given nucleic acid molecule. In higher plants, deoxyribonucleic acid (DNA) is the genetic material 
while ribonucleic acid (RNA) is involved in the transfer of information contained within DNA into 
proteins. The term "nucleotide sequence" refers to a polymer of DNA or RNA which can be single-
or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases capable
of incorporation into DNA or RNA polymers. The terms "nucleic acid" or "nucleic acid sequence" 
may also be used interchangeably with gene, cDNA, DNA and RNA encoded by a gene. 

The invention encompasses isolated or substantially purified nucleic acid or protein 
compositions. In the context of the present invention, an "isolated" or "purified" DNA molecule or an
"isolated" or "purified" polypeptide is a DNA molecule or polypeptide that, by the hand of man,
exists apart from its native environment and is therefore not a product of nature. An isolated DNA 
molecule or polypeptide may exist in a purified form or may exist in a non-native environment such
as, for example, a transgenic host cell. For example, an "isolated" or "purified" nucleic acid molecule 
or protein, or biologically active portion thereof, is substantially free of other cellular material, or
culture medium when produced by recombinant techniques, or substantially free of chemical
precursors or other chemicals when chemically synthesized. Preferably, an "isolated" nucleic acid is
free of sequences (preferably protein encoding sequences) that naturally flank the nucleic acid (i.e.,
sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism
from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic
acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of
nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from
which the nucleic acid is derived. A protein that is substantially free of cellular material includes 
preparations of protein or polypeptide having less than about 30%, 20%, 10%, or 5% (by dry weight)
of contaminating protein. When the protein of the invention, or biologically active portion thereof, is
recombinantly produced, preferably culture medium represents less than about 30%, 20%, 10%, or
5% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.

The nucleotide sequences of the invention include both the naturally occurring sequences as 
well as mutant (variant) forms. Such variants will continue to possess the desired activity, i.e., either
promoter activity or the activity of the product encoded by the open reading frame of the non-variant
nucleotide sequence. 

Thus, by "variants" is intended substantially similar sequences. For nucleotide sequences 
comprising an open reading frame, variants include those sequences that, because of the degeneracy
of the genetic code, encode the identical amino acid sequence of the native protein. Naturally
occurring allelic variants such as these can be identified with the use of well-known molecular biology
techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques. 
Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those 
generated, for example, by using site-directed mutagenesis and, for open reading frames, encode the
native protein, as well as those that encode a polypeptide having amino acid substitutions relative to
the native protein. Generally, nucleotide sequence variants of the invention will have at least 40%, 50%,
60%, to 70%, e.g., preferably 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, to 79%, generally at
least 80%, e.g., 81%-84%, at least 85%, e.g., 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%,
94%, 95%, 96%, 97%, to 98% and 99% nucleotide sequence identity to the native (wild type or
endogenous) nucleotide sequence.

"Conservatively modified variations" of a particular nucleic acid sequence refers to those 
nucleic acid sequences that encode identical or essentially identical amino acid sequences, or where 
the nucleic acid sequence does not encode an amino acid sequence, to essentially identical 
sequences. Because of the degeneracy of the genetic code, a large number of functionally identical
nucleic acids encode any given polypeptide. For instance the codons CGT, CGC, CGA, CGG,
AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is 
specified by a codon, the codon can be altered to any of the corresponding codons described 
without altering the encoded protein. Such nucleic acid variations are "silent variations" which are 
one species of "conservatively modified variations." Every nucleic acid sequence described herein
which encodes a polypeptide also describes every possible silent variation, except where otherwise
noted. One of skill will recognize that each codon in a nucleic acid (except ATG, which is ordinarily 
the only codon for methionine) can be modified to yield a functionally identical molecule by standard 
techniques. Accordingly, each "silent variation" of a nucleic acid which encodes a polypeptide is 
implicit in each described sequence. 
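
The codon degeneracy described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1); the function names `synonymous_codons`, `translate`, and `count_silent_variants` are hypothetical and are introduced here only for illustration.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1),
# with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return every codon (sorted) that encodes the same residue as `codon`."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def translate(orf):
    """Translate an open reading frame, ignoring any trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def count_silent_variants(orf):
    """Count the nucleotide sequences (including `orf` itself) that encode
    the same polypeptide, i.e. the product of per-codon degeneracies."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    total = 1
    for i in range(0, usable, 3):
        total *= len(synonymous_codons(orf[i:i + 3]))
    return total


if __name__ == "__main__":
    print(synonymous_codons("CGT"))         # the six arginine codons
    print(count_silent_variants("ATGCGT"))  # Met has one codon, Arg has six
```

Under these assumptions, the arginine codon CGT maps to the same six codons listed in the text, and ATG has no synonymous alternative.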

The nucleic acid molecules of the invention can be "optimized" for enhanced expression in
plants of interest. See, for example, EPA 035472; WO 91/16432; Perlak et al., 1991; and Murray
et al., 1989. In this manner, the open reading frames in genes or gene fragments can be synthesized
utilizing plant-preferred codons. See, for example, Campbell and Gowri, 1990 for a discussion of
host-preferred codon usage. Thus, the nucleotide sequences can be optimized for expression in any
plant. It is recognized that all or any part of the gene sequence may be optimized or synthetic. That
is, synthetic or partially optimized sequences may also be used. Variant nucleotide sequences and
proteins also encompass sequences and proteins derived from a mutagenic and recombinogenic
procedure such as DNA shuffling. With such a procedure, one or more different coding sequences
can be manipulated to create a new polypeptide possessing the desired properties. In this manner,
libraries of recombinant polynucleotides are generated from a population of related sequence
polynucleotides comprising sequence regions that have substantial sequence identity and can be
homologously recombined in vitro or in vivo. Strategies for such DNA shuffling are known in the
art. See, for example, Stemmer, 1994; Stemmer, 1994; Crameri et al., 1997; Moore et al., 1997;
Zhang et al., 1997; Crameri et al., 1998; and U.S. Patent Nos. 5,605,793 and 5,837,458.

By "variant" polypeptide is intended a polypeptide derived from the native protein by 
deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C- 
terminal end of the native protein; deletion or addition of one or more amino acids at one or more 
sites in the native protein; or substitution of one or more amino acids at one or more sites in the
native protein. Such variants may result from, for example, genetic polymorphism or from human
manipulation. Methods for such manipulations are generally known in the art. 

Thus, the polypeptides may be altered in various ways including amino acid substitutions, 
deletions, truncations, and insertions. Methods for such manipulations are generally known in the art.
For example, amino acid sequence variants of the polypeptides can be prepared by mutations in the
DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See,
for example, Kunkel, 1985; Kunkel et al., 1987; U.S. Patent No. 4,873,192; Walker and Gaastra,
1983 and the references cited therein. Guidance as to appropriate amino acid substitutions that do
not affect biological activity of the protein of interest may be found in the model of Dayhoff et al.
(1978). Conservative substitutions, such as exchanging one amino acid with another having similar
properties, are preferred.

Individual substitutions, deletions or additions that alter, add or delete a single amino acid or a
small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded
sequence are "conservatively modified variations," where the alterations result in the substitution of
an amino acid with a chemically similar amino acid. Conservative substitution tables providing
functionally similar amino acids are well known in the art. The following five groups each contain
amino acids that are conservative substitutions for one another: Aliphatic: Glycine (G), Alanine (A),
Valine (V), Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
Sulfur-containing: Methionine (M), Cysteine (C); Basic: Arginine (R), Lysine (K), Histidine (H); Acidic:
Aspartic acid (D), Glutamic acid (E), Asparagine (N), Glutamine (Q). See also, Creighton, 1984. In
addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid
or a small percentage of amino acids in an encoded sequence are also "conservatively modified
variations."
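
The five substitution groups listed above can be expressed as a simple lookup. The following Python sketch is illustrative only; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and residues that appear in none of the five groups (for example serine or proline) are treated here, as an assumption, as having no conservative partner.

```python
# The five groups as listed in the text, keyed by one-letter residue codes.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aromatic": set("FYW"),
    "sulfur-containing": set("MC"),
    "basic": set("RKH"),
    "acidic": set("DENQ"),
}


def is_conservative(original, replacement):
    """Return True if swapping `original` for `replacement` stays within
    one of the five groups. Identical residues are not a substitution."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())


if __name__ == "__main__":
    for pair in [("L", "I"), ("R", "K"), ("L", "K"), ("S", "T")]:
        print(pair, is_conservative(*pair))
```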

"Expression cassette" as used herein means a DNA sequence capable of directing
expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter
operably linked to the nucleotide sequence of interest which is operably linked to termination signals.
It also typically comprises sequences required for proper translation of the nucleotide sequence. The
coding region usually codes for a protein of interest but may also code for a functional RNA of
interest, for example antisense RNA or a nontranslated RNA, in the sense or antisense direction. The
expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at
least one of its components is heterologous with respect to at least one of its other components. The
expression cassette may also be one which is naturally occurring but has been obtained in a
recombinant form useful for heterologous expression. The expression of the nucleotide sequence in
the expression cassette may be under the control of a constitutive promoter or of an inducible
promoter which initiates transcription only when the host cell is exposed to some particular external
stimulus. In the case of a multicellular organism, the promoter can also be specific to a particular
tissue or organ or stage of development.






"Vector" is defined to include, inter alia, any plasmid, cosmid, phage or Agrobacterium 
binary vector in double or single stranded linear or circular form which may or may not be self
transmissible or mobilizable, and which can transform a prokaryotic or eukaryotic host either by
integration into the cellular genome or by existing extrachromosomally (e.g., an autonomously
replicating plasmid with an origin of replication).

Specifically included are shuttle vectors by which is meant a DNA vehicle capable, naturally 
or by design, of replication in two different host organisms, which may be selected from 
actinomycetes and related species, bacteria and eukaryotic (e.g. higher plant, mammalian, yeast or 
fungal cells). 

Preferably the nucleic acid in the vector is under the control of, and operably linked to, an
appropriate promoter or other regulatory elements for transcription in a host cell such as a microbial,
e.g. bacterial, or plant cell. The vector may be a bi-functional expression vector which functions in
multiple hosts. In the case of genomic DNA, this may contain its own promoter or other regulatory
elements and in the case of cDNA this may be under the control of an appropriate promoter or other
regulatory elements for expression in the host cell.

"Cloning vectors" typically contain one or a small number of restriction endonuclease 
recognition sites at which foreign DNA sequences can be inserted in a determinable fashion without 
loss of essential biological function of the vector, as well as a marker gene that is suitable for use in 
the identification and selection of cells transformed with the cloning vector. Marker genes typically
include genes that provide tetracycline resistance, hygromycin resistance or ampicillin resistance.

A "transgenic plant" is a plant having one or more plant cells that contain an expression
vector.

"Plant tissue" includes differentiated and undifferentiated tissues or plants, including but not 
limited to roots, stems, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells and 
culture such as single cells, protoplasts, embryos, and callus tissue. The plant tissue may be in plants
or in organ, tissue or cell culture. 

The following terms are used to describe the sequence relationships between two or more 
nucleic acids or polynucleotides: (a) "reference sequence", (b) "comparison window", (c) "sequence 
identity", (d) "percentage of sequence identity", and (e) "substantial identity". 

(a) As used herein, "reference sequence" is a defined sequence used as a basis for sequence
comparison. A reference sequence may be a subset or the entirety of a specified sequence; for
example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene
sequence.

(b) As used herein, "comparison window" makes reference to a contiguous and specified 
segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison 
window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which 
does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the 
comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 
100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference 
sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically 
introduced and is subtracted from the number of matches. 

Methods of alignment of sequences for comparison are well known in the art. Thus, the 
determination of percent identity between any two sequences can be accomplished using a 
mathematical algorithm. Preferred, non-limiting examples of such mathematical algorithms are the 
algorithm of Myers and Miller, 1988; the local homology algorithm of Smith et al. 1981; the 
homology alignment algorithm of Needleman and Wunsch 1970; the search-for-similarity method of
Pearson and Lipman 1988; the algorithm of Karlin and Altschul, 1990, modified as in Karlin and 
Altschul, 1993. 

Computer implementations of these mathematical algorithms can be utilized for comparison 
of sequences to determine sequence identity. Such implementations include, but are not limited to: 
CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, California); the 
ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the 
Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group 
(GCG), 575 Science Drive, Madison, Wisconsin, USA). Alignments using these programs can be 
performed using the default parameters. The CLUSTAL program is well described by Higgins et al. 
1988; Higgins et al. 1989; Corpet et al. 1988; Huang et al. 1992; and Pearson et al. 1994. The 
ALIGN program is based on the algorithm of Myers and Miller, supra. The BLAST programs of 
Altschul et al., 1990, are based on the algorithm of Karlin and Altschul supra. 

Software for performing BLAST analyses is publicly available through the National Center
for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying
high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence,
which either match or satisfy some positive-valued threshold score T when aligned with a word of
the same length in a database sequence. T is referred to as the neighborhood word score threshold
(Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to
find longer HSPs containing them. The word hits are then extended in both directions along each
sequence for as far as the cumulative alignment score can be increased. Cumulative scores are
calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching
residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in
each direction is halted when the cumulative alignment score falls off by the quantity X from its
maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one
or more negative-scoring residue alignments, or the end of either sequence is reached.
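
The extension step described above can be sketched in Python as follows. This is a simplified, ungapped, one-directional illustration and not the BLAST implementation itself; the function name `extend_right` and the default drop-off value X = 20 are assumptions made for the example, while M = 5 and N = -4 follow the BLASTN defaults quoted elsewhere in this description.

```python
def extend_right(query, subject, q_end, s_end, seed_score,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed hit to the right without gaps.

    `q_end` and `s_end` are the positions just past the seed in each
    sequence, and `seed_score` is the score of the seed itself. Extension
    stops when the score falls `x_drop` below its maximum, drops to zero
    or below, or either sequence ends.
    Returns (best_score, number_of_residues_added_at_best_score).
    """
    score = best = seed_score
    best_ext = 0
    i = 0
    while q_end + i < len(query) and s_end + i < len(subject):
        score += match if query[q_end + i] == subject[s_end + i] else mismatch
        i += 1
        if score > best:
            best, best_ext = score, i
        elif best - score >= x_drop or score <= 0:
            break
    return best, best_ext


if __name__ == "__main__":
    q = "ACGTACGTAAAA"
    s = "ACGTACGTCCCC"
    # Seed: the first four residues (ACGT) match, so seed_score = 4 * 5.
    print(extend_right(q, s, 4, 4, seed_score=20))
```

A full implementation would run the same procedure to the left of the seed and combine the two results.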

In addition to calculating percent sequence identity, the BLAST algorithm also performs a
statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993)). One
measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which
provides an indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to
a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence
to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01,
and most preferably less than about 0.001.

To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0)
can be utilized as described in Altschul et al. 1997. Alternatively, PSI-BLAST (in BLAST 2.0) can
be used to perform an iterated search that detects distant relationships between molecules. See
Altschul et al., supra. When utilizing BLAST, Gapped BLAST, PSI-BLAST, the default
parameters of the respective programs (e.g. BLASTN for nucleotide sequences, BLASTX for
proteins) can be used. The BLASTN program (for nucleotide sequences) uses as defaults a
wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of
both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of
3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1989).
See http://www.ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.

For purposes of the present invention, comparison of nucleotide sequences for determination
of percent sequence identity to the promoter sequences disclosed herein is preferably made using the
BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program. By
"equivalent program" is intended any sequence comparison program that, for any two sequences in
question, generates an alignment having identical nucleotide or amino acid residue matches and an
identical percent sequence identity when compared to the corresponding alignment generated by the
preferred program.

(c) As used herein, "sequence identity" or "identity" in the context of two nucleic acid or
polypeptide sequences makes reference to the residues in the two sequences that are the same when
aligned for maximum correspondence over a specified comparison window. When percentage of
sequence identity is used in reference to proteins it is recognized that residue positions which are not
identical often differ by conservative amino acid substitutions, where amino acid residues are
substituted for other amino acid residues with similar chemical properties (e.g., charge or
hydrophobicity) and therefore do not change the functional properties of the molecule. When
sequences differ in conservative substitutions, the percent sequence identity may be adjusted
upwards to correct for the conservative nature of the substitution. Sequences that differ by such
conservative substitutions are said to have "sequence similarity" or "similarity." Means for making this
adjustment are well known to those of skill in the art. Typically this involves scoring a conservative
substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence
identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative
substitution is given a score of zero, a conservative substitution is given a score between
zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the
program PC/GENE (Intelligenetics, Mountain View, California).

(d) As used herein, "percentage of sequence identity" means the value determined by 
comparing two optimally aligned sequences over a comparison window, wherein the portion of the 
polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) 
as compared to the reference sequence (which does not comprise additions or deletions) for optimal
alignment of the two sequences. The percentage is calculated by determining the number of positions
at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the
number of matched positions, dividing the number of matched positions by the total number of
positions in the window of comparison, and multiplying the result by 100 to yield the percentage of
sequence identity.
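
The calculation in definition (d) can be written out as a short Python sketch. It assumes that the two sequences have already been optimally aligned by one of the programs named above and are supplied as equal-length strings with "-" marking gaps; the function name `percent_identity` is hypothetical. Gapped columns are counted in the window length but never as matches, which is one reasonable reading of "total number of positions in the window of comparison".

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the identical residue.

    Both inputs must be pre-aligned strings of equal, non-zero length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Nine alignment columns, seven of which hold identical bases.
    print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```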

(e)(i) The term "substantial identity" of polynucleotide sequences means that a polynucleotide
comprises a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or
79%, preferably at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more
preferably at least 90%, 91%, 92%, 93%, or 94%, and most preferably at least 95%, 96%, 97%,
98%, or 99% sequence identity, compared to a reference sequence using one of the alignment
programs described using standard parameters. One of skill in the art will recognize that these values
can be appropriately adjusted to determine corresponding identity of proteins encoded by two
nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame
positioning, and the like. Substantial identity of amino acid sequences for these purposes normally
means sequence identity of at least 70%, more preferably at least 80%, 90%, and most preferably at
least 95%.

Another indication that nucleotide sequences are substantially identical is if two molecules
hybridize to each other under stringent conditions (see below). Generally, stringent conditions are
selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a
defined ionic strength and pH. However, stringent conditions encompass temperatures in the range
of about 1°C to about 20°C, depending upon the desired degree of stringency as otherwise qualified
herein. Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides they encode are substantially identical. This may occur,
e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the
genetic code. One indication that two nucleic acid sequences are substantially identical is when the
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide
encoded by the second nucleic acid.






(e)(ii) The term "substantial identity" in the context of a peptide indicates that a peptide
comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%,
preferably 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least
90%, 91%, 92%, 93%, or 94%, or even more preferably, 95%, 96%, 97%, 98% or 99%,
sequence identity to the reference sequence over a specified comparison window. Preferably,
optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch
(1970). An indication that two peptide sequences are substantially identical is that one peptide is
immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is
substantially identical to a second peptide, for example, where the two peptides differ only by a
conservative substitution.

For sequence comparison, typically one sequence acts as a reference sequence to which test 
sequences are compared. When using a sequence comparison algorithm, test and reference 
sequences are input into a computer, subsequence coordinates are designated if necessary, and 
sequence algorithm program parameters are designated. The sequence comparison algorithm then
calculates the percent sequence identity for the test sequence(s) relative to the reference sequence,
based on the designated program parameters. 

As noted above, another indication that two nucleic acid sequences are substantially identical 
is that the two molecules hybridize to each other under stringent conditions. The phrase "hybridizing 
specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular
nucleotide sequence under stringent conditions when that sequence is present in a complex mixture
(e.g., total cellular) DNA or RNA. "Bind(s) substantially" refers to complementary hybridization 
between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be 
accommodated by reducing the stringency of the hybridization media to achieve the desired detection 
of the target nucleic acid sequence. 

"Stringent hybridization conditions" and "stringent hybridization wash conditions" in the
context of nucleic acid hybridization experiments such as Southern and Northern hybridization are
sequence dependent, and are different under different environmental parameters. The Tm is the
temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to
a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the
critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA
hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl, 1984: Tm = 81.5°C +
16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent
cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the
percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash
conditions can be adjusted to hybridize to sequences of the desired identity. For example, if
sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent
conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific
sequence and its complement at a defined ionic strength and pH. However, severely stringent
conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4°C lower than the thermal melting
point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C
lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or
wash at 11, 12, 13, 14, 15, or 20°C lower than the thermal melting point (Tm). Using the equation,
hybridization and wash compositions, and desired T, those of ordinary skill will understand that
variations in the stringency of hybridization and/or wash solutions are inherently described. If the
desired degree of mismatching results in a T of less than 45°C (aqueous solution) or 32°C
(formamide solution), it is preferred to increase the SSC concentration so that a higher temperature
can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen, 1993.
Generally, highly stringent hybridization and wash conditions are selected to be about 5°C lower than
the thermal melting point Tm for the specific sequence at a defined ionic strength and pH.
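
The Meinkoth and Wahl approximation and the 1°C-per-1%-mismatch adjustment described above can be evaluated numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the logarithm is taken as base 10, which is the usual convention for this formula but is an assumption with respect to the text.

```python
import math


def melting_temperature(molarity, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (in °C) of a DNA-DNA hybrid:
    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(% form) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(molarity)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def tm_for_identity(tm_perfect, percent_identity):
    """Lower Tm by about 1 °C for each 1% of mismatching."""
    return tm_perfect - (100.0 - percent_identity)


if __name__ == "__main__":
    # Hypothetical example: 1 M monovalent cations, 50% GC,
    # 50% formamide, 500 bp hybrid.
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(round(tm, 1))                         # perfectly matched hybrid
    print(round(tm_for_identity(tm, 90.0), 1))  # hybrids of about 90% identity
```

A stringent wash temperature would then be chosen about 5°C below the adjusted value, in line with the ranges given above.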

An example of highly stringent wash conditions is 0.15 M NaCl at 72°C for about 15
minutes. An example of stringent wash conditions is a 0.2X SSC wash at 65°C for 15 minutes (see,
Sambrook, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a
low stringency wash to remove background probe signal. An example medium stringency wash for a
duplex of, e.g., more than 100 nucleotides, is 1X SSC at 45°C for 15 minutes. An example low
stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6X SSC at 40°C for
15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically
involve salt concentrations of less than about 1.5 M, more preferably about 0.01 to 1.0 M, Na ion
concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C
and at least about 60°C for long probes (e.g., >50 nucleotides). Stringent conditions may also be
achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise
ratio of 2X (or higher) than that observed for an unrelated probe in the particular hybridization assay
indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other
under stringent conditions are still substantially identical if the proteins that they encode are
substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum
codon degeneracy permitted by the genetic code.

Very stringent conditions are selected to be equal to the Tm for a particular probe. An
example of stringent conditions for hybridization of complementary nucleic acids which have more
than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g.,
hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to
65°C. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35%
formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC
(20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55°C. Exemplary moderate stringency
conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a
wash in 0.5X to 1X SSC at 55 to 60°C.

The following are examples of sets of hybridization/wash conditions that may be used to clone
orthologous nucleotide sequences that are substantially identical to reference nucleotide sequences of
the present invention: a reference nucleotide sequence preferably hybridizes to the reference
nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C
with washing in 2X SSC, 0.1% SDS at 50°C, more desirably in 7% sodium dodecyl sulfate (SDS),
0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C, more desirably
still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in
0.5X SSC, 0.1% SDS at 50°C, preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1
mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 50°C, more preferably in 7% sodium
dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1%
SDS at 65°C.

"DNA shuffling" is a method to introduce mutations or rearrangements, preferably randomly,
in a DNA molecule or to generate exchanges of DNA sequences between two or more DNA
molecules, preferably randomly. The DNA molecule resulting from DNA shuffling is a shuffled DNA
molecule that is a non-naturally occurring DNA molecule derived from at least one template DNA
molecule. The shuffled DNA preferably encodes a variant polypeptide modified with respect to the
polypeptide encoded by the template DNA, and may have an altered biological activity, with respect
to the polypeptide encoded by the template DNA.

"Recombinant DNA molecule" is a combination of DNA sequences that are joined together
using recombinant DNA technology and procedures used to join together DNA sequences as
described, for example, in Sambrook et al., 1989.

The word "plant" refers to any plant, particularly to seed plants, and "plant cell" is a structural
and physiological unit of the plant, which comprises a cell wall but may also refer to a protoplast.
The plant cell may be in the form of an isolated single cell or a cultured cell, or a part of a more
highly organized unit such as, for example, a plant tissue, or a plant organ.

"Significant increase" is an increase that is larger than the margin of error inherent in the
measurement technique, preferably an increase by about 2-fold or greater.

"Significantly less" means that the decrease is larger than the margin of error inherent in the
measurement technique, preferably a decrease by about 2-fold or greater.
Within the scope of the present invention a set of nucleic acid molecules is provided which comprises
polynucleotides relating to genes which are shown to be preferentially up-regulated and to share a
similar expression pattern during the process of grain filling. The polynucleotides within this subgroup
are useful tools for generating plants which produce grain with modified compositional characteristics
leading to improved nutritional properties.

In one embodiment, the present invention thus relates to an isolated nucleic acid molecule
comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated
during grain filling and the use of said molecule for modifying the nutritional composition and quality
of the plant grain.

The majority of the polynucleotides within this group encode protein products that are directly
involved in or associated with three major pathways of nutrition partitioning: the synthesis and 
transport of (1) carbohydrates, (2) proteins, and (3) fatty acids. 

Carbohydrates are the most abundant organic molecules in nature and modulation of their
synthesis, accumulation, and storage presents a vast template of possibilities for improving the quality
and quantity of agricultural plants, food crops, consumer health products such as dietary
supplements, and many industrial applications. In plants, carbohydrates occur as mono-, di-, or
polysaccharides and have the essential functions of providing the plant with chemical energy and
structural stability. Although sugar uptake from external sources generally is not a relevant process,
the redistribution of sugar (usually glucose) from photosynthesizing tissues to non-green cells is of
major importance. Once translocated to terminal sink storage tissues, sugars are converted to starch
and stored in the leucoplasts of seeds, fruits, tubers and roots, as well as actively growing
photosynthetic tissues. These plant tissues provide the bulk of human dietary intake, and as such, the
anabolic pathways of synthesis and assimilation (starch, fatty acids, and nitrogen) are of particular
importance to agriculture and commercial industry.

As major contributors to the global carbon cycle, plants and algae bind 100 billion metric
tons of carbon into carbohydrates each year. Nucleotide sequences encoding at least one
polypeptide involved in sugar and carbohydrate metabolism and their end products, as well as the
polypeptides encoded thereby, or antigene sequences thereof, are commercially useful materials
that can be used to study these processes and to modify these processes to elicit desired
modifications in the compositional and nutritional characteristics of the plant grain.

In particular, the subset of nucleic acid molecules provided herein, which comprises 
polynucleotides relating to genes that are up -regulated during grain filling and involved in 
carbohydrate transport, synthesis, metabolism, or degradation is a valuable tool box from which an 

25 appropriate nucleic acid molecule can be chosen for modifying the quantity and quality of the 

carbohydrate and sugar content of the grain, respectively. This can be achieved by introducing and 
overexpressing at least one polynucleotide from the various subsets of nucleic acid molecules 
provided herein in the plant, but preferentially in the appropriate tissues of the plant grain such as,
for example, the plant endosperm or by reducing the expression level of the corresponding 
endogenous gene by methods known in the art including antisense and dsRNAi techniques.

It is thus one of the major objectives of the present invention to identify and provide a subset 
of nucleic acid molecules comprising at least one polynucleotide which encodes a protein that is 
involved in the metabolism of carbohydrates during grain filling. By modifying the expression level of 
at least one of the polynucleotides from this subgroup in a plant, but preferably in the appropriate
tissues of the plant grain such as, for example, the plant endosperm, and even more preferably at an 
early stage in seed development, it is possible to modify the carbohydrate composition of the plant 
grain accordingly. 

In one embodiment, the invention thus relates to a polynucleotide comprising a nucleotide 
sequence encoding a polypeptide the activity of which is involved in or associated with the synthesis, 
metabolism or degradation of carbohydrates in the plant grain and the expression of which is up- 
regulated during grain filling, which nucleotide sequence is substantially similar to a sequence 
encoding a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-210.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide the activity of which is involved in or associated with the synthesis,
metabolism or degradation of carbohydrates in the plant grain and the expression of which is
up-regulated during grain filling, and which is substantially similar, and preferably has at least between
70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs given in
table 7 such as SEQ ID NOs: 70-210, with any individual number within this range of between
70% and 99% also being part of the invention.
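
By way of illustration only, the following Python sketch shows one way in which a percentage of amino acid sequence identity of the kind referred to above could be computed for two sequences that have already been aligned. The function names, the gap character `-`, the 70% default threshold argument, and the choice of denominator (every aligned column in which at least one sequence carries a residue) are assumptions made for this example; the present description does not prescribe a particular alignment program or identity formula, and other conventions give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap ('-') are ignored.
    A column counts as a match only if both residues are identical
    and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if x == y and x != "-":
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(identity: float, threshold: float = 70.0) -> bool:
    """True if the identity (in percent) reaches the chosen threshold."""
    return identity >= threshold


if __name__ == "__main__":
    # Hypothetical toy peptides, not taken from the Sequence Listing.
    value = percent_identity("MKTAYIAK", "MKTAYIAR")
    print(value, meets_identity_threshold(value))
```

Any individual threshold between 70% and 99% can be passed as the `threshold` argument of this hypothetical helper.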

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a
polypeptide the activity of which is involved in or associated with the synthesis, metabolism or 
degradation of carbohydrates in the plant grain and the expression of which is up-regulated during 
grain filling, and which is immunologically reactive with antibodies raised against a polypeptide as 
given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-210. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs of table 7 such as SEQ ID NOs: 69-209
or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the
activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs of table 
7 such as SEQ ID NOs: 69 - 209 or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 
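
The relationship between a nucleotide sequence, its complement, and its reverse complement, as referred to in items (e) and (f) above, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, only the unambiguous DNA letters A, C, G, T and the wildcard N are handled, and the example sequence is a toy string that does not come from the Sequence Listing.

```python
# Translation table for Watson-Crick base pairing (DNA, upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(sequence: str) -> str:
    """Base-by-base complement, read in the same direction as the input."""
    return sequence.translate(_COMPLEMENT)


def reverse_complement(sequence: str) -> str:
    """Complement read in the opposite direction (5'->3' of the other strand)."""
    return complement(sequence)[::-1]


if __name__ == "__main__":
    toy = "ATGCCG"  # hypothetical example sequence
    print(complement(toy))          # TACGGC
    print(reverse_complement(toy))  # CGGCAT
```

Applying `reverse_complement` twice returns the original sequence, which is a convenient sanity check for any such helper.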

One of the defining questions in assimilate partitioning is understanding how plants regulate 
the allocation of photosynthate between competing sink organs. In addition to the number of 
competing organs, and the sink strength of each, exogenous factors such as abiotic stress or 
pathogen infection may also influence partitioning (Bush, Current Opinion in Plant Biology 2: 187
(1999)).

Within the present invention a subset of genes could be identified that are known to be 
involved in the plant's response to abiotic and/or biotic stresses and demonstrated to be
up-regulated during grain filling. By providing these genes it is now possible to regulate the expression
levels of the encoded protein products in the plant grain during the grain filling process by applying
methods known in the art including overexpressing or down-regulating the nucleic acid molecule in a
plant, or preferably a plant seed, thereby modifying the partitioning in the developing grain. 

In one aspect, the present invention relates to a polynucleotide comprising a nucleotide
sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the
activity of which is involved in or associated with the plant's response to abiotic and/or biotic
stresses, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as
given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 2-18.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide the expression of which is up-regulated during grain filling and the activity of 
which is involved in or associated with the plant's response to abiotic and/or biotic stresses, and
which is substantially similar, and preferably has at least between 70% and 99% amino acid
sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 4 such 
as SEQ ID NOs: 2-18, with any individual number within this range of between 70% and 99% also 
being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a polypeptide the expression of which is up-regulated during grain filling and the activity of which is
involved in or associated with the plant's response to abiotic and/or biotic stresses, and which is 
immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ 
ID NOs of table 4 such as SEQ ID NOs: 2-18. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 1-17
or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the
activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the 
SEQ ID NOs of table 4 such as SEQ ID NOs: 1-17 or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 
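
Item (d) above refers to nucleic acids comprising 50 to 200 or more consecutive nucleotides of a given sequence. As a purely illustrative sketch, the Python generator below enumerates every stretch of a chosen length of consecutive nucleotides in a sequence. The function name and the toy input are assumptions made for this example, and enumerating such fragments says nothing about whether any of them would hybridize under particular conditions, which is an experimental question.

```python
from typing import Iterator


def consecutive_fragments(sequence: str, length: int) -> Iterator[str]:
    """Yield every run of `length` consecutive nucleotides of `sequence`.

    Nothing is yielded when the sequence is shorter than `length`.
    """
    if length <= 0:
        raise ValueError("length must be a positive integer")
    for start in range(len(sequence) - length + 1):
        yield sequence[start:start + length]


if __name__ == "__main__":
    # Hypothetical toy sequence; a real use would pass length=50 or more.
    for fragment in consecutive_fragments("ATGCA", 3):
        print(fragment)
```

A sequence of N nucleotides contains N - L + 1 such fragments of length L, so a 1,000-nucleotide sequence contains 951 distinct start positions for a 50-nucleotide fragment.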

The regulation of source-sink pathways encompasses complex mechanisms that integrate the
expression of enzymes involved in carbohydrate production in source tissue with those involved with
utilization in sink tissue. The elucidation of the underlying signal transduction pathways of sink-source
regulation is of critical importance to the genetic manipulation of source-sink relations in transgenic
plants.

Within the scope of the present invention a subset of genes was identified comprising genes
that are up-regulated during grain filling and encode polypeptides with a kinase or phosphatase 
activity which are known to be involved in signal transduction pathways. 

In a specific embodiment, the present invention provides nucleic acid molecules such as 
those represented in SEQ ID NOs: 19-29 that encode enzymes which exhibit a kinase or
phosphatase activity and/or are involved in a signaling pathway and are thus key to the ability of
regulating utilization of carbon/sugar sources, and partitioning of assimilates between source and sink 
tissues. 

The invention thus relates to a polynucleotide comprising a nucleotide sequence encoding a 
polypeptide which exhibits a kinase or phosphatase activity and/or is involved in a signal
transduction pathway, the expression of which is up-regulated during grain filling, which nucleotide
sequence is substantially similar to a sequence encoding a polypeptide as given in any one of the
SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide which exhibits a kinase or phosphatase activity and is up-regulated during
grain filling and has at least between 70% and 99% amino acid sequence identity to at least one
polypeptide as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30, with
any individual number within this range of between 70% and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a polypeptide which exhibits a kinase or phosphatase activity and is up-regulated during grain filling
and is immunologically reactive with antibodies raised against a polypeptide as given in any one of the
SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 19-29
or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the
activity of the full-length polypeptide;

b) having substantial similarity to (a);

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ
ID NOs of table 5 such as SEQ ID NOs: 19-29 or the complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c). 

Regulating the environment-induced carbon status in crop plants, particularly the partitioning
in storage organs, provides industry with the ability to limit or expand growing seasons to better suit
commercial markets, to enhance the quality and content of food products derived from storage
organs or other tissue-specific components of crop plants, and to modulate many other metabolic
pathways in plants (such as nitrogen assimilation, phosphorylation and the activation of regulatory
proteins) that affect consumer end use.

Another possibility for modifying the carbohydrate content of the grain is through regulation 
of the transport of sugars and carbohydrates during grain filling.

Supplying carbohydrates to sink tissues via apoplastic mechanisms involves the release of
sucrose into the apoplast by an exporter, cleavage by an extracellular invertase, and uptake of
hexose monomers by monosaccharide transporters. 

In one specific embodiment the present invention thus relates to a polynucleotide comprising 
a nucleotide sequence encoding a polypeptide with an activity which is involved in or associated with 
sugar transport and up-regulated during grain filling, which nucleotide sequence is substantially similar
to a sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 6 such as
SEQ ID NOs: 36, 50, and 58.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide with an activity which is involved in or associated with sugar transport and 
up-regulated during grain filling and is substantially similar, and preferably has at least between 70%
and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID
NOs of table 6 such as SEQ ID NOs: 36, 50, and 58, with any individual number within this range
of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide with an activity which is involved in or associated with sugar transport and
up-regulated during grain filling and is immunologically reactive with antibodies raised against a
polypeptide as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 36, 50, and 58.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 35,
49, and 57 or a part thereof which still encodes a partial-length polypeptide
having substantially the same activity as the full-length polypeptide, e.g., at least
50%, more preferably at least 80%, even more preferably at least 90% to 95%
the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ 
ID NOs of table 6 such as SEQ ID NOs: 35, 49, and 57 or the complement
thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

Transmembrane transport of sugars has been demonstrated by the presence of transporter 
genes for a few crop species (spinach, potato). For the uses and application of modifying sugar 
transport mechanisms with regard to controlling the timing and extent of grain fill durations, we 
incorporate all relevant sections of PCT Publication WO9953068 to Allen et al., and for uses and
application of modifying cells or plastids involved in hexose carrier proteins we incorporate all
relevant sections of PCT Publication WO9953082 to Allen et al.

Glucosyl equivalents for starch biosynthesis are found within the scope of the present 
invention to be transported into the plastid (amyloplast) either as glucose-1-phosphate via a
hexose-phosphate-Pi transporter (a representative example of which is given in SEQ ID NO: 35), as triose
phosphates via a triose-phosphate-Pi translocator (a representative example of which is given in
SEQ ID NO: 163), as phosphoenolpyruvate via a PEP-Pi translocator (SEQ ID NO: 175), or as
ADP-glucose via a Brittle-like adenylate translocator or via an oxoglutarate/malate transporter. One
isoform of a triose-phosphate/phosphate translocator (SEQ ID NO: 163) is expressed to a slightly
higher level during earlier stages of grain development. 

Pyruvate appears to play a more important role during early stages of grain development in
that a gene encoding an isoform of a PEP-Pi translocator (SEQ ID NO: 175) is relatively more
highly expressed at this stage. In maize endosperm, the majority of glucosyl moieties are transported
to the amyloplast during the linear phase of starch accumulation as ADP-glucose (J.C. Shannon et
al., Plant Physiol. 117, 1235 (1998)).

For uses and application of modifying amyloplasts in the regulation of starch production via
an ADP glucose transporter, we incorporate all relevant sections of PCT Publication WO9947681
to Emes et al.

Further examples of genes encoding a sugar transporter are provided in SEQ ID NOs: 35,
49, and 57. By providing the nucleic acid molecules according to the invention encoding sugar
transporters the expression of which is up-regulated during grain filling such as those given in SEQ ID
NOs: 36; 50; 58; 36385; and 53483, it is now possible to manipulate the translocation and
storage of sugars and their carbohydrate end products in the plant grain. 
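
In the passages above, nucleotide sequences are cited under odd SEQ ID NOs (e.g., 35, 49 and 57) and the polypeptides they encode under the immediately following even SEQ ID NOs (e.g., 36, 50 and 58). The following Python sketch merely illustrates that apparent numbering convention. It is an assumption inferred from the paired lists in this section, not a rule stated in the description, the function name is hypothetical, and the Sequence Listing itself remains the authoritative source for any pairing.

```python
def paired_polypeptide_id(nucleotide_seq_id: int) -> int:
    """Return the SEQ ID NO assumed to hold the encoded polypeptide.

    Assumption for illustration: a nucleotide sequence with odd SEQ ID NO n
    is paired with the polypeptide of SEQ ID NO n + 1.
    """
    if nucleotide_seq_id < 1 or nucleotide_seq_id % 2 == 0:
        raise ValueError("expected a positive, odd nucleotide SEQ ID NO")
    return nucleotide_seq_id + 1


if __name__ == "__main__":
    # Sugar transporter examples cited in the text: 35, 49 and 57.
    for seq_id in (35, 49, 57):
        print(seq_id, "->", paired_polypeptide_id(seq_id))
```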

In still another embodiment the present invention provides a further subset of nucleic acid
molecules which are up-regulated during grain filling comprising a nucleotide sequence encoding a
polypeptide that has a transmembrane domain and assists in the transport of amino acids and
inorganic compounds including nitrate and various cations, which nucleotide sequence is substantially
similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 32; 38; 40; 42; 44; 46; 48;
52; 54; 56; 60; 62; 64; 66; and 68.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide that has a transmembrane domain and assists in the transport of amino acids
and inorganic compounds including nitrate and various cations and is up-regulated during grain filling
and is substantially similar, and preferably has at least between 70% and 99% amino acid sequence
identity to at least one polypeptide of SEQ ID NOs: 32; 38; 40; 42; 44; 46; 48; 52; 54; 56; 60; 62;
64; 66; and 68, with any individual number within this range of between 70% and 99% also being
part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a polypeptide, that has a transmembrane domain and assists in the transport of amino acids and 
inorganic compounds including nitrate and various cations and is up-regulated during grain filling and 
is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 32; 38; 40; 
42; 44; 46; 48; 52; 54; 56; 60; 62; 64; 66; and 68.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 31; 37; 39; 41; 43; 45; 47; 51; 53; 55; 59;
61; 63; 65; and 67, or a part thereof which still encodes a partial-length
polypeptide having substantially the same activity as the full-length polypeptide, 
e.g., at least 50%, more preferably at least 80%, even more preferably at least 
90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 31; 37;
39; 41; 43; 45; 47; 51; 53; 55; 59; 61; 63; 65; and 67, or the complement
thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

In particular, the invention provides a nucleic acid molecule which is up-regulated during 
grain filling and comprises a nucleotide sequence encoding a polypeptide that belongs to the POT or 
PTR family. 

The POT family (also called the PTR (peptide transport) family) consists of proteins
from animals, plants, yeast, archaea, and both Gram-negative and Gram-positive bacteria. Several of
these organisms possess multiple POT family paralogues. The proteins are of about 450-600 amino
acyl residues in length with the eukaryotic proteins in general being longer than the bacterial proteins.
They exhibit 12 putative or established transmembrane α-helical spanners. Some members of the
POT family exhibit limited sequence similarity to protein members of the major facilitator superfamily
(MFS; TC #2.A.1). (Comparison scores of up to 8 standard deviations for segments in excess of 60
residues in length.) Thus the POT family is probably a family within the MFS. 

While most members of the POT family catalyze peptide transport, one is a nitrate permease
and one can transport histidine as well as peptides. Some of the peptide transporters can also
transport antibiotics. They function by proton symport, but the substrate:H+ stoichiometry is variable:
the high affinity rat PepT2 carrier catalyzes uptake of 2 and 3 H+ with neutral and anionic dipeptides,
respectively, while the low affinity PepT1 carrier catalyzes uptake of one H+ per neutral peptide. In
eukaryotes, some of these transporters may be in organellar membranes such as the lysosomes. 

The generalized transport reaction catalyzed by the proteins of the POT family is:

substrate (out) + nH+ (out) → substrate (in) + nH+ (in).

In a specific embodiment, the present invention relates to an isolated nucleic acid molecule 
which is up-regulated during grain filling and comprises a nucleotide sequence encoding a 
polypeptide that belongs to the POT or PTR family, which nucleotide sequence is substantially 
similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 38; 52, and 68.

In particular, the invention relates to an isolated nucleic acid molecule comprising a 
nucleotide sequence encoding a polypeptide, which belongs to the POT or PTR family and is
up-regulated during grain filling and is substantially similar, and preferably has at least between 70% and
99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 38; 52, and 68, with
any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to an isolated nucleic acid molecule comprising a nucleotide 
sequence encoding a polypeptide, which belongs to the POT or PTR family and up-regulated during 
grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID 
NOs: 38; 52, and 68. 

More particularly, the invention relates to an isolated nucleic acid molecule comprising a
nucleotide sequence

a) as given in any one of SEQ ID NOs: 37; 51, and 67 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 37; 51,
and 67 or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

One of the economically most important and valuable carbohydrate end products is starch, 
which is an essential component of many food, feed, and industrial products. It consists of two types 
of glucan polymers: relatively long chained polymers with few branches known as amylose, and
shorter chained but highly branched molecules called amylopectin. 

Its biosynthesis depends on the complex interaction of multiple enzymes (Smith, A. et al.,
(1995) Plant Physiol. 107:673-677; Preiss, J., (1988) Biochemistry of Plants 14:181-253). The
key enzymes in starch biosynthesis are ADP-glucose pyrophosphorylase, which catalyzes the
formation of ADP-glucose; a series of starch synthases, which use ADP-glucose as a substrate for
polymer formation using α-1,4 linkages; and several starch branching enzymes, which modify
the polymer by transferring segments of polymer to other parts of the polymer using α-1,6
linkages, creating branched structures. However, based on data from starch-forming plants such as
potato and corn, it is becoming clear that other enzymes also play a role in the determination of the
final structure of starch. In particular, debranching and disproportionating enzymes not only 
participate in starch degradation, but also in modification of starch structure during its biosynthesis. 
Different models for this action have been proposed, but all share the concept that such activities, or 
lack thereof, change the structure of the starch produced. 

In plants used typically for the production of starch, such as maize or potato, the synthesized 
starch consists of approximately 25% amylose starch and of about 75% amylopectin starch.

With respect to the homogeneity of the basic component starch for its use in the industrial
area, starch- producing plants are needed which contain, for example, only the component 
amylopectin or only the component amylose. For a number of other uses plants are needed that 
synthesize amylopectin types with different degrees of branching.

Such plants may for example be obtained by breeding or by means of mutagenesis techniques.
It is known for various plant species, such as maize, that by means of mutagenesis varieties may
be produced in which only amylopectin is formed. Also in the case of potato a genotype was
produced from a haploid line by means of chemical mutagenesis. Said genotype does not form 
amylose (Hovenkamp-Hermelink, Theor. Appl. Genet. 75 (1987), 217-221). 

Apart from conventional breeding and mutagenesis techniques, recombinant DNA
techniques are now increasingly used in order to specifically interfere with the starch metabolism of
starch storing plants. A prerequisite for this is that DNA sequences be provided which encode 
enzymes involved in the starch metabolism. 

The present invention now provides a subset of nucleic acid molecules that are involved in
the starch biosynthesis pathway and were shown to be up-regulated during grain filling.
Representative examples of the genes of this subset are provided in SEQ ID NOs: 69-187 of the
Sequence Listing. 

In a particular embodiment, the present invention relates to a polynucleotide comprising a 
nucleotide sequence encoding a polypeptide which is involved in or associated with starch biosynthesis
and up-regulated during grain filling, which nucleic acid molecule is substantially similar to a nucleic
acid encoding a polypeptide as given in any one of the SEQ ID NOs of table 7 such as SEQ ID
NOs: 70-188.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide, which is involved in or associated with starch biosynthesis and up-regulated 
during grain filling and is substantially similar, and preferably has at least between 70% and 99%
amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of
table 7 such as SEQ ID NOs: 70-188, with any individual number within this range of between
70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide, which is involved in or associated with starch biosynthesis and up-regulated during 
grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in 
any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-188.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 69-187
or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the 
activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ 
ID NOs of table 7 such as SEQ ID NOs: 69-187, or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

By providing a subset of genes encoding polypeptides that are involved in starch metabolism 
it is now possible to interfere with starch metabolism to produce starch with modified 
physicochemical characteristics.

A gene encoding the small subunit of ADPG pyrophosphorylase (SEQ ID NO: 138) is
expressed at early stages of grain development in conjunction with a single gene encoding a large
subunit (SEQ ID NO: 140). Three other large subunits (SEQ ID NOs: 136; 142) are up-regulated
at a later stage in development from 4 days after anthesis, in conjunction with the up-regulation of the
starch synthase genes (SEQ ID NOs: 129; 131; and 133) and two genes for branching enzymes
(SEQ ID NOs: 70; and 72) (involved in amylose and amylopectin biosynthesis, respectively). Only
one (distinct from the two mentioned above) of the small subunit genes increases in this time period.
The expression of different isoforms may be related to the shift to storage starch production and a
postulated concomitant shift to cytoplasmic ADP-glucose production (Stark, D.M., et al.,
"Regulation of the Amount of Starch in Plant Tissues by ADP Glucose Pyrophosphorylase", 
Science, 258, 287-291 (Oct. 9, 1992)). 

In one embodiment the present invention provides a nucleic acid molecule comprising a 
nucleotide sequence which encodes a small subunit of ADPG pyrophosphorylase. In another
embodiment the invention provides a nucleic acid molecule comprising a nucleotide sequence which
encodes a large subunit of ADPG pyrophosphorylase. 

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, 
respectively, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a
polypeptide as given in SEQ ID NOs: 136 - 142. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, 
respectively, which is up-regulated during grain filling and has at least between 70% and 99% amino
acid sequence identity to at least one polypeptide of SEQ ID NOs: 136-142, with any individual
number within this range of between 70% and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, respectively,
which is up-regulated during grain filling and is immunologically reactive with antibodies raised against a
polypeptide of SEQ ID NOs: 136-142.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 135-141 or a part thereof
which still encodes a partial-length polypeptide having substantially the same
activity as the full-length polypeptide, e.g., at least 50%, more preferably at least
80%, even more preferably at least 90% to 95% the activity of the full-length
polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 135-141,
or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

The nucleic acid molecules of the instant invention may be used to create transgenic plants in 
which the small and/or large subunits of ADPG pyrophosphorylase are present at higher or lower 
levels than normal or in cell types or developmental stages in which they are not normally found. This may
have the effect of altering starch structure in those cells or tissues but especially in the developing 
grain. 

For a further targeted modification of the starch in plants, in particular of the degree of
branching of starch synthesized in plants by means of recombinant DNA techniques, it is still 
necessary to identify DNA sequences that encode enzymes participating in the starch metabolism, 
particularly in the branching of starch molecules. 

In the case of potato, for example, DNA sequences have by now been described which 
encode a granule-bound starch synthase or a branching enzyme (Q enzyme), and they have been 
used in order to genetically modify plants. 

Apart from the Q enzymes that introduce branchings into starch molecules, enzymes occur in 
plants which are capable of dissolving branchings. These enzymes are called debranching enzymes. 

In the case of sugar beet, Li et al. (Plant Physiol. 98 (1992), 1277-1284) could only prove the 
occurrence of one debranching enzyme, apart from five endo- and two exoamylases. This enzyme
having a size of approximately 100 kD and an optimum pH value of 5.5 is located within the 
chloroplasts. A debranching enzyme was also described for spinach. The debranching enzyme from 
spinach as well as that from sugar beet exhibit a fivefold lower activity in a reaction with amylopectin 
as substrate when compared to a reaction with pullulan as a substrate (Ludwig et al., Plant Physiol. 
74 (1984), 856-861; Li et al., Plant Physiol. 98 (1992), 1277-1284). The isolation of a cDNA 
encoding a debranching enzyme was described for spinach (Renz et al., Plant Physiol. 108 (1995),
1342). 

The existence of a debranching enzyme for maize has been described in the prior art. The
corresponding mutant was designated su (sugary). The gene of the sugary locus was cloned recently
(see James et al., Plant Cell 7 (1995), 417-429). In the case of the agriculturally significant
starch-storing cultured plant potato, the activity of a debranching enzyme was examined by Hobson et al. (J.
Chem. Soc., (1951), 1451). It was proven that the respective enzyme, contrary to the Q enzyme,
does not exhibit any activities leading to an elongation of the polysaccharide chain, but merely
hydrolyses α-1,6-glycosidic bonds.

Within the scope of the present invention a subset of genes is provided that encode
polypeptides the activity of which is associated with the structural shaping of the starch granule. In
particular, the invention provides a subset of genes that encode polypeptides the activity of which is
associated with the branching/debranching (representative examples of which are given in SEQ ID NOs:
69 - 73 / 75 - 77 (isoamylase debranching enzyme)) and/or degradation of starch (α-amylase (SEQ
ID NOs: 79 - 91), pullulanase (SEQ ID NO: 109) [the last gene in the α-amylase series], α-amylase
inhibitor (SEQ ID NOs: 93 - 99), β-amylase (SEQ ID NOs: 101 - 107), α-glucosidase (SEQ ID
NOs: 111 - 117)). By modulating the expression of the polypeptides according to the invention, the
amylose/amylopectin ratio can be changed in order to accommodate the varying quality standards for
food and/or feed applications or specific processing requirements. For example, by over-expressing
or inhibiting the expression of endogenous branching and/or debranching enzyme genes in rice or
any other cereal crop plant, a plant can be produced that exhibits increased or reduced
amounts, respectively, of branching/debranching enzyme activity for the purpose of modifying the degree of
branching of the amylopectin starch.

By inhibiting the expression of endogenous branching and/or debranching enzyme genes,
plants are produced that exhibit a reduced activity of these enzymes, which leads to the synthesis of a
modified starch. Inhibition of branching/debranching gene expression can be achieved by applying
methods known in the art such as, for example, antisense or dsRNAi techniques. By applying these
techniques it is possible to produce plants in which the expression of an endogenous
branching/debranching enzyme gene in rice or any other cereal crop plant is inhibited to different
degrees within the range of 0.1% to 100%, with all individual numbers within this range also being
part of the invention. This enables in particular the production of cereal plants synthesizing
amylopectin starch with a wide variety of degrees of branching. This constitutes an
advantage over conventional breeding and mutagenesis techniques, in which a lot of time
and cost is required in order to provide such a variety. Highly branched amylopectin has a
particularly large surface and is therefore particularly suitable as a copolymer. A high degree of
branching furthermore leads to an improvement of the amylopectin's solubility in water. This property
is very advantageous for certain technical applications.
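
The degree of inhibition quoted above (0.1% to 100%) can be expressed as a simple ratio of expression levels. The following is a minimal sketch, assuming that expression is measured in the same arbitrary units in the wild-type and the modified plant; the function name and the example values are hypothetical and are not data from the specification.

```python
def percent_inhibition(wild_type_level: float, modified_level: float) -> float:
    """Percent reduction of expression relative to the wild type.

    Assumption: both levels are non-negative measurements in the same
    units (for example, normalized transcript abundance).
    """
    if wild_type_level <= 0:
        raise ValueError("wild-type expression level must be positive")
    if modified_level < 0:
        raise ValueError("expression level cannot be negative")
    return (1.0 - modified_level / wild_type_level) * 100.0


# Hypothetical measurement: expression drops from 200 to 50 units.
print(percent_inhibition(200.0, 50.0))
```

A complete knock-out corresponds to a modified level of zero and therefore to 100% inhibition under this definition.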

Another way of modifying the branching characteristics of starch is by overexpressing the
nucleic acid molecule according to the invention encoding a branching/debranching enzyme activity from
rice in a transgenic plant, but especially in a plant seed.

The expression of a novel or additional branching/debranching enzyme activity from rice
in the transgenic plant cells and plants of the invention influences the degree of branching of the
amylopectin synthesized in the cells and plants. Therefore, a starch synthesized in these plants
exhibits modified physical and/or chemical properties when compared to starch from wild-type plants.

Genes encoding products involved in starch structure rearrangement (debranching enzyme
(SEQ ID NOs: 75 - 77 (isoamylase debranching enzyme)); branching enzyme (SEQ ID NOs: 69 -
73)) and starch degradation (α-amylase (SEQ ID NOs: 79 - 91), α-amylase inhibitor (SEQ ID NOs:
93 - 99); pullulanase (SEQ ID NO: 109) [the last gene in the α-amylase series], β-amylase (SEQ
ID NOs: 101 - 107), α-glucosidase (SEQ ID NOs: 111 - 117)) are all strongly expressed towards
the end of grain development, reflecting their involvement in the final stages of shaping the starch
granule. Genes encoding isoforms of an α-amylase inhibitor (SEQ ID NOs: 93 and 95) are
expressed most strongly in the aleurone and seed coat layers and the endosperm, and not (or to a
reduced extent) in the embryo. The embryo also shows a different expression of genes encoding
starch synthase and branching enzymes, perhaps reflecting its status as an energy-requiring sink
organ rather than as a storage tissue. Myers et al. discuss the interaction of starch synthases,
branching enzymes, debranching enzymes and disproportionating enzymes in producing and trimming
glucan molecules so that a final transition may take place to a crystalline form (A.M. Myers, M.K.
Morell, M.G. James, S.G. Ball. Plant Physiol. 122, 989 (2000)).

In a further embodiment, the present invention provides the ability to modulate the shape and
the physico-chemical properties of the starch granule by modifying the expression level and pattern of
those genes that encode products involved in starch structure rearrangement such as, for
example, isoamylase debranching enzyme (SEQ ID NOs: 75 - 77) and branching enzyme (SEQ ID NOs:
69 - 73), and in starch degradation such as α-amylase (SEQ ID NOs: 79 - 91), α-amylase inhibitor (SEQ ID
NOs: 93 - 99), pullulanase (SEQ ID NO: 109), β-amylase (SEQ ID NOs: 101 - 107), and/or
α-glucosidase (SEQ ID NOs: 111 - 117).

The invention thus also relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide involved in starch structure rearrangement, which nucleic acid molecule is
substantially similar to a nucleic acid encoding a polypeptide as given in the SEQ ID NOs of table 7
such as SEQ ID NOs: 75 - 77 exhibiting isoamylase debranching enzyme activity; 69 - 73
exhibiting a branching enzyme activity; 80 - 92 exhibiting an α-amylase activity; 94 - 100 exhibiting
an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102 - 108 exhibiting a
β-amylase activity; 112 - 118 exhibiting an α-glucosidase activity.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide which is involved in starch structure rearrangement and up-regulated during
grain filling and has at least between 70% and 99% amino acid sequence identity to at least one
polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75 - 77 exhibiting
isoamylase debranching enzyme activity; 69 - 73 exhibiting a branching enzyme activity; 80 - 92
exhibiting an α-amylase activity; 94 - 100 exhibiting an α-amylase inhibitor activity; 110
exhibiting a pullulanase activity; 102 - 108 exhibiting a β-amylase activity; 112 - 118 exhibiting an
α-glucosidase activity, with any individual number within this range of between 70% and 99% also
being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide which is involved in starch structure rearrangement and up-regulated during grain filling
and immunologically reactive with antibodies raised against a polypeptide as given in the SEQ ID
NOs of table 7 such as SEQ ID NOs: 75 - 77 exhibiting isoamylase debranching enzyme activity;
69 - 73 exhibiting a branching enzyme activity; 80 - 92 exhibiting an α-amylase activity; 94
- 100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102 - 108
exhibiting a β-amylase activity; 112 - 118 exhibiting an α-glucosidase activity.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75 - 77
exhibiting isoamylase debranching enzyme activity; 69 - 73 exhibiting a
branching enzyme activity; 79 - 91 exhibiting an α-amylase activity; 93 - 99
exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity;
101 - 107 exhibiting a β-amylase activity; 111 - 117 exhibiting an
α-glucosidase activity, or a part thereof which still encodes a partial-length
polypeptide having substantially the same activity as the full-length polypeptide,
e.g., at least 50%, more preferably at least 80%, even more preferably at least
90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence as given in the SEQ ID
NOs of table 7 such as SEQ ID NOs: 75 - 77 exhibiting isoamylase
debranching enzyme activity; 69 - 73 exhibiting a branching enzyme activity; 79
- 91 exhibiting an α-amylase activity; 93 - 99 exhibiting an α-amylase inhibitor
activity; 109 exhibiting a pullulanase activity; 101 - 107 exhibiting a β-amylase
activity; 111 - 117 exhibiting an α-glucosidase activity, or the complement
thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 
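
The nucleotide ranges quoted in items (a) and (d) (for example 79 - 91) and the polypeptide ranges quoted in the preceding paragraphs (for example 80 - 92) suggest that each odd SEQ ID NO designates a nucleotide sequence and the next even number its encoded polypeptide. This pairing is an inference from the quoted ranges, not something the text states, and a few of the quoted ranges do not follow it. Under that assumption, a minimal sketch with hypothetical function names:

```python
def nucleotide_ids(first: int, last: int) -> list[int]:
    """Odd SEQ ID NOs in the inclusive range first..last.

    Assumption (inferred, not stated in the text): nucleotide sequences
    carry odd SEQ ID NOs.
    """
    if first % 2 == 0 or last % 2 == 0 or first > last:
        raise ValueError("expected an ascending range bounded by odd numbers")
    return list(range(first, last + 1, 2))


def polypeptide_ids(first: int, last: int) -> list[int]:
    """Even SEQ ID NOs assumed to pair with the nucleotide range first..last."""
    return [n + 1 for n in nucleotide_ids(first, last)]


# The alpha-amylase range quoted in the text: nucleotides 79 - 91.
print(nucleotide_ids(79, 91))
print(polypeptide_ids(79, 91))
```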

The identification of a defined subset of genes that are involved in carbohydrate metabolism 
but especially in starch metabolism and the expression of which is coordinately up- or down- 
regulated during the grain filling process makes it now possible to improve grain quality by
overexpressing and/or underexpressing or completely knocking out genes that are known to 
positively contribute to the nutritional or processing properties of grains such as, for example, genes 
encoding products involved in starch structure rearrangement and starch degradation as mentioned 
hereinbefore. 

The expression of α-amylase, which is central in the starch biosynthesis pathway, may further
be modified to obtain plants producing a desirable content of reducing sugars. For example, a high
content of reducing sugar resulting from a high α-amylase activity is desirable when rice or other
cereal plants are to be used for the production of alcohol. This can be achieved by modifying the
expression of the plant endogenous genes encoding an α-amylase or α-amylase inhibitor activity, for
example, by introducing and overexpressing in a target plant a nucleic acid molecule comprising a
nucleotide sequence that encodes a polypeptide the amino acid sequence of which is substantially
similar to any one of those given in SEQ ID NOs: 80 - 92 exhibiting an α-amylase activity; and 94 -
100 exhibiting an α-amylase inhibitor activity.

In a specific embodiment, the invention thus also relates to a polynucleotide comprising a
nucleotide sequence encoding a polypeptide exhibiting an amylase or an amylase inhibitor activity,
which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given
in the SEQ ID NOs of table 7 such as SEQ ID NOs: 80 - 92 exhibiting an α-amylase activity; and
94 - 100 exhibiting an α-amylase inhibitor activity.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide which has an activity of an amylase and is up-regulated during grain filling
and has at least between 70% and 99% amino acid sequence identity to at least one polypeptide as
given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 80 - 92 exhibiting an α-amylase activity;
and 94 - 100 exhibiting an α-amylase inhibitor activity, with any individual number within this range
of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide which has an activity of an amylase and is up-regulated during grain filling and is
immunologically reactive with antibodies raised against a polypeptide as given in the SEQ ID NOs of
table 7 such as SEQ ID NOs: 80 - 92 exhibiting an α-amylase activity; and 94 - 100 exhibiting an
α-amylase inhibitor activity.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 
a) as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 79 - 91 exhibiting
an α-amylase activity; and 93 - 99 exhibiting an α-amylase inhibitor activity or a
part thereof which still encodes a partial-length polypeptide having substantially
the same activity as the full-length polypeptide, e.g., at least 50%, more
preferably at least 80%, even more preferably at least 90% to 95% the activity
of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence as given in the SEQ ID NOs of
table 7 such as SEQ ID NOs: 79 - 91 exhibiting an α-amylase activity; and 93 -
99 exhibiting an α-amylase inhibitor activity or the complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c). 
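
Item (d) refers to stretches of 50 to 200 or more consecutive nucleotides of a listed sequence. Hybridization itself is an experimental property and is not modelled here; the following minimal sketch, with a hypothetical function name and a synthetic sequence, only enumerates the contiguous stretches of a given length that could serve as candidate probe regions.

```python
def consecutive_fragments(seq: str, length: int) -> list[str]:
    """All contiguous substrings of `seq` with exactly `length` nucleotides.

    Returns an empty list when the sequence is shorter than `length`.
    """
    if length <= 0:
        raise ValueError("fragment length must be positive")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Synthetic 60-nucleotide sequence, for illustration only.
example = "ACGT" * 15
fragments = consecutive_fragments(example, 50)
print(len(fragments), len(fragments[0]))
```

A sequence of n nucleotides yields n - length + 1 such fragments, so the synthetic 60-mer above gives 11 fragments of 50 nucleotides.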

Different isoforms often show distinct spatial expression patterns. For example, three
different sucrose synthase isoforms (SEQ ID NOs: 119 - 123) are expressed in developing grain
tissue, two of which (SEQ ID NOs: 121 and 123) are expressed more highly at the start of grain
development (0 days post anthesis) and one of which (SEQ ID NO: 119) is up-regulated towards the
end of grain development. The spatial distribution of each differs. Other isoforms (SEQ ID NOs:
125 and 127), showing low expression in the grain, are expressed strongly in stems or roots.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide exhibiting a sucrose synthase activity, which nucleic acid molecule is 
substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 120 - 128.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide which has an activity of a sucrose synthase and is up-regulated during grain
filling and has at least between 70% and 99% amino acid sequence identity to at least one
polypeptide of SEQ ID NOs: 120 - 128, with any individual number within this range of between
70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide which has an activity of a sucrose synthase and is up-regulated during grain
filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs:
120 - 128.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 119 - 127 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 119 -
127 or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

In a further embodiment, the present invention provides the ability to regulate glucanases (as
represented by SEQ ID NO: 191). Glucanases can be used to minimize wet droppings in poultry and
swine fed diets high in wheat or barley by breaking down and reducing the viscosity of β-glucans
and other non-starch polysaccharides, and thus can provide benefit as a processing aid in animal
feed. For uses and applications of modifying crop plants by creating transgenic monocots and
monocot seeds expressing rice β-glucanase enzymes and genes we incorporate all relevant sections of
PCT Publication WO9859046 to Rodriguez.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide exhibiting a glucanase activity, which nucleic acid molecule is substantially 
similar to a nucleic acid encoding a polypeptide as given in SEQ ID NO: 192.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence
encoding a polypeptide which has an activity of a glucanase and is up-regulated during grain filling
and has at least between 70% and 99% amino acid sequence identity to the polypeptide of
SEQ ID NO: 192, with any individual number within this range of between 70% and 99% also
being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding
a polypeptide which has an activity of a glucanase and is up-regulated during grain filling and is
immunologically reactive with antibodies raised against the polypeptide of SEQ ID NO: 192.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in SEQ ID NO: 191 or a part thereof which still encodes a
partial-length polypeptide having substantially the same activity as the full-length
polypeptide, e.g., at least 50%, more preferably at least 80%, even more
preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a);

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of the nucleotide sequence given in SEQ ID NO: 191 or the
complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c).



Thus, in an embodiment applicable to all of the above stated provisions, the present invention
provides nucleotide sequences encoding at least one polypeptide involved in the synthesis,
metabolism, transport or storage of carbohydrates, as well as any polypeptides encoded thereby, or
any antigene sequences thereof, which have numerous applications using techniques that are known
to those skilled in the art of molecular biology, biotechnology, biochemistry, genetics, physiology or
pathology. These techniques include the use of nucleotide molecules as hybridization probes, for
chromosome and gene mapping, in PCR technologies, in the production of sense or antisense nucleic
acids, in screening for new therapeutic molecules, in production of plants and seeds having desirable,
inheritable, commercially useful phenotypes, or in discovery of inhibitory compounds.

In a further collective embodiment, the present invention provides the ability to modulate
carbohydrates, sugars and their transporters in plant tissues, by over-expressing, under-expressing or
knocking out one or more cell cycle genes or their gene products, in a plant cell, in vitro or in
planta. Expression vectors comprising at least one nucleotide sequence involved in carbohydrate or
sugar synthesis, metabolism, transport or storage, or any antigenes thereof, operably linked to at
least one suitable promoter and/or regulatory sequence can be used to study the role of polypeptides
encoded by said sequences, for example by transforming a host cell with said expression vector and
measuring the effects of overexpression and underexpression of sequences. A host cell transformed
with at least one expression vector comprising nucleotide sequences involved in carbohydrate
modulation, operably linked to suitable promoters and/or regulatory sequences, can be useful to
produce a dietary supplement comprising a polypeptide having a defined amino acid profile.

In a further collective embodiment, the present invention provides a transformed plant host
cell, or one obtained through breeding, capable of over-expressing, under-expressing, or having a
knock-out of said metabolic genes and/or their gene products.

Such a plant cell, transformed with at least one expression vector comprising nucleotide sequences
involved in carbohydrate synthesis, metabolism, transport or storage, operably linked to suitable
promoters and/or regulatory sequences, can be used to regenerate plant tissue or an entire plant, or
seed therefrom, in which the effects of expression, including overexpression or underexpression, of
the introduced sequence or sequences can be measured in vitro or in planta.

A further subset of genes provided herein comprises genes that encode polypeptides with an 
activity that is involved in or associated with the production of seed storage proteins. 

In seeds of higher plants, proteins are contained in an amount of 20-30% by weight in the case of
beans, and in an amount of about 10% by weight in the case of cereals, based on dry weight. Among the
proteins in seeds, 70-80% by weight are storage proteins. Particularly, in rice seeds, about 80% by
weight of the seed storage proteins is glutelin, which is only soluble in dilute acids and dilute alkalis.
The remainder is prolamin (10-15% by weight), soluble in organic solvents, and globulin (5-10%
by weight), solubilized by salts.
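
The percentages above can be combined into a rough estimate of how much glutelin a given mass of rice seed contains. The following minimal sketch multiplies the fractions quoted in the text (about 10% protein in cereals, 70-80% of protein as storage protein, about 80% of storage protein as glutelin); the function name and the 100 g reference mass are assumptions chosen for illustration.

```python
def storage_class_grams(seed_dry_weight_g: float,
                        protein_fraction: float,
                        storage_fraction: float,
                        class_fraction: float) -> float:
    """Mass of one storage-protein class in a given dry mass of seed.

    Each fraction is a proportion between 0 and 1, taken here from the
    approximate figures quoted in the text.
    """
    for value in (protein_fraction, storage_fraction, class_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return seed_dry_weight_g * protein_fraction * storage_fraction * class_fraction


# Glutelin in 100 g (dry weight) of rice seed, lower and upper estimate.
low = storage_class_grams(100.0, 0.10, 0.70, 0.80)
high = storage_class_grams(100.0, 0.10, 0.80, 0.80)
print(f"glutelin: {low:.1f} g to {high:.1f} g per 100 g dry seed")
```

On these figures, glutelin accounts for roughly 5.6 g to 6.4 g per 100 g of dry rice seed.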

Seed storage proteins are important as a protein source in foods and feeds, so that they have
been well studied from the viewpoints of nutrition and protein chemistry. As a result, in cereals,
storage protein genes of maize, wheat, barley and the like have been cloned, amino acid sequences 
of the proteins have been deduced from the nucleotide sequence, and regulatory regions of the genes 
have been analyzed. 

The present invention provides a subset of nucleic acid molecules that is up-regulated during
grain filling and comprises a nucleotide sequence encoding a seed storage protein. Representative
examples of these genes are given in SEQ ID NOs: 211 - 249.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence 
encoding a seed storage protein, which nucleic acid molecule is substantially similar to a nucleic acid
encoding a polypeptide as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 
212-250. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a seed storage protein which is up-regulated during grain filling and has at least between 
70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the
SEQ ID NOs of table 8 such as SEQ ID NOs: 212 - 250, with any individual number within this 
range of between 70% and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a seed storage protein, which is up-regulated during grain filling and immunologically reactive with 
antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 8 such as
SEQ ID NOs: 212 - 250.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 211 -
249 or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the
activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ
ID NOs of table 8 such as SEQ ID NOs: 211 - 249 or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

By providing the above subset of genes, the protein content and composition in the plant grain
can be modified by up- or down-regulating the expression of at least one nucleic acid molecule 
within this subgroup giving rise to altered levels or an altered composition of seed storage protein in 
the plant grain. 

For rice grains to be processed, it is advantageous that the protein content is low. In the case of
rice to be used for preparing fermented alcoholic beverages, this can be attained through well-defined
refinement measures, thereby removing the proteins in the peripheral portion of the endosperm, which
contains large amounts of storage proteins. In producing rice starch, in order to improve the purity,
proteins are removed by treatments with alkalis, surfactants and ultrasonication.

The protein content in the rice grain also influences the taste of rice. Good-tasting rice grains
usually have low protein contents. Rice varieties with a low protein content have been developed
by conventional cross-breeding or by mutation breeding (United States Patent 5,516,668;
Maruta).

US-P 5,516,668 describes a method for decreasing the amount of glutelin in plant seeds,
comprising introducing into a rice plant a gene which is a template for the transcription of an antisense
RNA against rice glutelin; and transcribing said gene in seeds from said rice plant to inhibit translation
of mRNA of glutelin, thereby decreasing the amount of glutelin in said seeds in comparison to the
amount of glutelin contained in seeds from unmodified wild-type rice plants.

The cDNA of glutelin, which is a seed storage protein in rice, has been cloned and the complete primary
structure of the protein has been determined by sequencing the cDNA. The gene of this protein has
been isolated by using the cDNA as a probe (Japanese Laid-open Patent Application (Kokai) No.
63-91085).

Rice plants with a low glutelin content in the rice grain can now be produced more efficiently
by down-regulating two or more of the endogenous glutelin genes in rice seeds such as those
provided in SEQ ID NOs: 223, 235, and 239 using methods known in the art including antisense
and dsRNAi techniques.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence
encoding a glutelin protein the expression of which is up-regulated during grain filling, which nucleic
acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID
NOs: 224, 236, and 240.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a glutelin protein the expression of which is up-regulated during grain filling and which has 
at least between 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID
NOs: 224, 236, and 240, with any individual number within this range of between 70% and 99%
also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a seed glutelin protein, the expression of which is up-regulated during grain filling and which is 
immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 224, 236,
and 240.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 223, 235, and 239 or a part thereof
which still encodes a partial-length polypeptide having substantially the same 
activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 
80%, even more preferably at least 90% to 95% the activity of the full-length 
polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 223, 235, and 239, or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

Another class of seed storage proteins are the prolamins, which are naturally rich in the 
essential amino acids lysine and methionine. Overexpressing said genes can thus increase the 
nutritional value of feeds and foods by producing said proteins at higher levels than those found in the 
unmodified wild-type plants. Another aspect of the present invention thus relates to providing genes 
that encode rice prolamin protein such as those given in SEQ ID NOs: 217, 219, 225 and 241.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding
a prolamin protein the expression of which is up-regulated during grain filling, which nucleotide 
sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ 
ID NOs: 218, 220, 226 and 242. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a prolamin protein, the expression of which is up-regulated during grain filling and which
has at least between 70% and 99% amino acid sequence identity to at least one polypeptide of
SEQ ID NOs: 218, 220, 226 and 242, with any individual number within this range of between 70% 
and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a prolamin protein, the expression of which is up-regulated during grain filling and which is 
immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 218, 220, 
226 and 242. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 217, 219, 225 and 241 or a part thereof 
which still encodes a partial-length polypeptide having substantially the same 
activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 
80%, even more preferably at least 90% to 95% the activity of the full-length 
polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 217, 219, 225 and 241, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

Gliadins are a further group of seed storage proteins that are of economic importance. Gliadin 
is a single-chained protein having an average molecular weight of about 30,000-40,000, with an 
isoelectric point of pH 4.0-5.0. Gliadin proteins are extremely sticky when hydrated and have little or no 
resistance to extension. Gliadin is responsible for giving gluten dough its characteristic cohesiveness. 
Gliadin is a premium product, when available. Gliadin is known to improve the freeze-thaw stability 
of frozen dough and also improves microwave stability. This product is also used as an all-natural 
chewing gum base replacer and a pharmaceutical binder, improves the texture and mouth feel of 
pasta products, and has been found to improve cosmetic products. 

The invention provides a further subset of genes comprising a nucleotide sequence that 
encodes gliadin storage proteins. By overexpressing said genes in the plant, but preferably in the 
plant seed, the plant produces grain with an increased concentration of gliadin as compared to the 
unmodified wild-type plant. 

In a particular embodiment, the invention thus relates to a polynucleotide comprising a 
nucleotide sequence encoding a gliadin protein, the expression of which is up-regulated during grain 
filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a 
polypeptide as given in SEQ ID NOs: 212, 219; 234, 248; and 250. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a gliadin protein, the expression of which is up-regulated during grain filling and which has 
at least between 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID 
NOs: 212, 219; 234, 248; and 250, with any individual number within this range of between 70% 
and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a seed gliadin protein, the expression of which is up-regulated during grain filling and which is 
immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 212, 219; 
234, 248; and 250. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 211, 220; 233, 247; and 249 or a part 
thereof which still encodes a partial-length polypeptide having substantially the 
same activity as the full-length polypeptide, e.g., at least 50%, more preferably at 
least 80%, even more preferably at least 90% to 95% the activity of the 
full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 211, 220; 233, 247; and 249, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

In a further embodiment the invention provides a subset of genes which encode polypeptides 
that are involved in or associated with the metabolism of fatty acids in the rice grain. 

Seed oil content has traditionally been modified by plant breeding. The use of recombinant 
DNA technology to alter seed oil composition can accelerate this process and in some cases alter 
seed oils in a way that cannot be accomplished by breeding alone. The oil composition of Brassica 
has been significantly altered by modifying the expression of a number of lipid metabolism genes. 
Such manipulations of seed oil composition have focused on altering the proportion of endogenous 
component fatty acids. For example, antisense repression of the .DELTA.12-desaturase gene in 
transgenic rapeseed has resulted in an increase in oleic acid of up to 83% (Topfer et al. 1995 
Science 268:681-686). 

There have been some successful attempts at modifying the composition of seed oil in 
transgenic plants by introducing new genes that allow the production of a fatty acid that the host 
plants were not previously capable of synthesizing. Van de Loo, et al. (1995 Proc. Natl. Acad. Sci 
USA 92:6743-6747) have been able to introduce a .DELTA.12-hydroxylase gene into transgenic 
tobacco, resulting in the introduction of a novel fatty acid, ricinoleic acid, into its seed oil. The 
reported accumulation was modest from plants carrying constructs in which transcription of the 
hydroxylase gene was under the control of the cauliflower mosaic virus (CaMV) 35S promoter. 
Similarly, tobacco plants have been engineered to produce low levels of petroselinic acid by 
expression of an acyl-ACP desaturase from coriander (Cahoon et al. 1992 Proc. Natl. Acad. Sci 
USA 89:11184-11188). 

The long chain fatty acids (C18 and larger) have significant economic value both as 
nutritionally and medically important foods and as industrial commodities (Ohlrogge, J. B. 1994 Plant 
Physiol. 104:821-826). Linoleic (18:2 .DELTA.9,12) and .alpha.-linolenic acid (18:3 
.DELTA.9,12,15) are essential fatty acids found in many seed oils. The levels of these fatty acids 
have been manipulated in oil seed crops through breeding and biotechnology (Ohlrogge, et al. 1991 
Biochim. Biophys. Acta 1082:1-26; Topfer et al. 1995 Science 268:681-686). Additionally, the 
production of novel fatty acids in seed oils can be of considerable use in both human health and 
industrial applications. 

Consumption of plant oils rich in .gamma.-linolenic acid (GLA) (18:3 .DELTA.6,9,12) is 
thought to alleviate hypercholesterolemia and other related clinical disorders which correlate with 
susceptibility to coronary heart disease (Brenner R. R. 1976 Adv. Exp. Med. Biol. 83:85-101). The 
therapeutic benefits of dietary GLA may result from its role as a precursor to prostaglandin synthesis 
(Weete, J. D. 1980 in Lipid Biochemistry of Fungi and Other Organisms, eds. Plenum Press, New 
York, pp. 59-62). Linoleic acid (18:2) (LA) is transformed into gamma linolenic acid (18:3) (GLA) 
by the enzyme .DELTA.6-desaturase. 
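
The shorthand used above (carbon count, number of double bonds, and .DELTA. positions counted from the carboxyl end) can be modelled in a few lines. The sketch below is illustrative only: the `FattyAcid` class and `desaturate` function are hypothetical names, and the model tracks notation rather than any enzyme chemistry. It shows how adding a double bond at position 6 to linoleic acid (18:2 .DELTA.9,12) gives the GLA shorthand (18:3 .DELTA.6,9,12).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    """Fatty acid described by chain length and delta double-bond positions."""
    carbons: int
    double_bonds: Tuple[int, ...]  # positions counted from the carboxyl end

    def shorthand(self) -> str:
        positions = ",".join(str(p) for p in self.double_bonds)
        return f"{self.carbons}:{len(self.double_bonds)} delta-{positions}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Return a new fatty acid with one extra double bond at the given delta position."""
    if position in fa.double_bonds:
        raise ValueError(f"double bond already present at position {position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


LINOLEIC = FattyAcid(18, (9, 12))   # 18:2 delta-9,12
gla = desaturate(LINOLEIC, 6)       # notational effect of a delta-6 desaturation
print(gla.shorthand())              # 18:3 delta-6,9,12
```

The same helper with position 15 yields the .alpha.-linolenic acid shorthand (18:3 .DELTA.9,12,15) mentioned earlier.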

Few seed oils contain GLA despite high contents of the precursor linoleic acid. This is due to 
the absence of .DELTA.6-desaturase activity in most plants. For example, only borage (Borago 
officinalis), evening primrose (Oenothera biennis), and currants (Ribes nigrum) produce appreciable 
amounts of .gamma.-linolenic acid. Of these three species, only Oenothera and borage are cultivated as a 
commercial source for GLA. It would be beneficial if agronomic seed oils could be engineered to 
produce GLA in significant quantities by introducing a heterologous .DELTA.6-desaturase gene. It 
would also be beneficial if other expression products associated with fatty acid synthesis and lipid 
metabolism could be produced in plants at high enough levels so that commercial production of a 
particular expression product becomes feasible. 

As disclosed in U.S. Pat. No. 5,552,306, a cyanobacterial .DELTA.6-desaturase gene 
has been recently isolated. Expression of this cyanobacterial gene in transgenic tobacco resulted in 
significant but low level GLA accumulation (Reddy et al. 1996 Nature Biotech. 14:639-642). 

The present invention now provides a subset of genes encoding polypeptides that are involved 
in or associated with fatty acid metabolism, the expression of which is up-regulated during grain 
filling. 

In particular, the invention relates to a polynucleotide the expression of which is up-regulated 
during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or 
associated with fatty acid synthesis or lipid metabolism, which nucleotide sequence is substantially 
similar to a nucleic acid sequence encoding a polypeptide as given in any one of the SEQ ID NOs of 
table 9 such as SEQ ID NOs: 252 - 280. 

More specifically, the invention relates to a polynucleotide the expression of which is up- 
regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved 
in or associated with fatty acid synthesis or lipid metabolism and has at least between 70% and 99% 
amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of 
table 9 such as SEQ ID NOs: 252 - 280, with any individual number within this range of between 
70% and 99% also being part of the invention. 

The invention further relates to a polynucleotide the expression of which is up-regulated 
during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or 
associated with fatty acid synthesis or lipid metabolism and immunologically reactive with antibodies 
raised against a polypeptide as given in any one of the SEQ ID NOs of table 9 such as SEQ ID 
NOs: 252 - 280. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 251 - 
279 or a part thereof which still encodes a partial-length polypeptide having 
substantially the same activity as the full-length polypeptide, e.g., at least 50%, 
more preferably at least 80%, even more preferably at least 90% to 95% the 
activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs 
of table 9 such as SEQ ID NOs: 251 - 279 or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

By providing this subset of genes it is now possible to modify the level and composition of 
grain lipids by modulating the expression of those genes in the plant seed. Expression can be 
modulated either by introducing at least one of the nucleic acid molecules from this subset into the 
plant, preferably under control of a seed specific promoter, and overexpressing said at least one 
nucleic acid molecule in the plant seed, or by down-regulating expression of the corresponding 
endogenous gene applying techniques known in the art including anti-sense and dsRNAi techniques. 

In a specific embodiment, the invention relates to a subset of genes encoding oleosins as 
represented by SEQ ID NOs: 257 and 259. 

Oleosins are abundant seed proteins associated with the phospholipid monolayer membrane of 
oil bodies, which are a means for storing lipids in the plant cell. Analysis of the contents of lipid 
bodies has demonstrated that in addition to triglyceride and membrane lipids, there are also several 
polypeptides/proteins associated with the surface or lumen of the oil body (Bowman-Vance and 
Huang, 1987, J. Biol. Chem., 262:11275-11279, Murphy et al., 1989, Biochem. J., 258:285-293, 
Taylor et al., 1990, Planta, 181:18-26). Oil-body proteins have been identified in a wide range of 
taxonomically diverse species (Moreau et al., 1980, Plant Physiol., 65:1176-1180; Qu et al., 1986, 
Biochem. J., 235:57-65) and have been shown to be uniquely localized in oil-bodies and not found 
in organelles of vegetative tissues. In Brassica napus (rapeseed, canola) there are at least three 
polypeptides associated with the oil-bodies of developing seeds (Taylor et al., 1990, Planta, 
181:18-26). 

Among the most abundant proteins associated with the phospholipid monolayer membrane of 
oil bodies are the oleosins. The first oleosin gene, L3, was cloned from maize by selecting clones 
whose in vitro translated products were recognized by an anti-L3 antibody (Vance et al. 1987 J. 
Biol. Chem. 262:11275-11279). Subsequently, different isoforms of oleosin genes from such 
different species as Brassica, soybean, carrot, pine, and Arabidopsis have been cloned (Huang, A. 
H. C., 1992, Ann. Reviews Plant Phys. and Plant Mol. Biol. 43:177-200; Kirik et al., 1996 Plant 
Mol. Biol. 31:413-417; Van Rooijen et al., 1992 Plant Mol. Biol. 18:1177-1179; Zou et al., Plant 
Mol. Biol. 31:429-433). Oleosin protein sequences predicted from these genes are highly conserved, 
especially for the central hydrophobic domain. All of these oleosins have the characteristic feature of 
three distinctive domains. An amphipathic domain of 40-60 amino acids is present at the N-terminus; 
a totally hydrophobic domain of 68-74 amino acids is located at the center; and an amphipathic 
.alpha.-helical domain of 33-40 amino acids is situated at the C-terminus (Huang, A. H. C. 1992). 

A maize oleosin has been expressed in seed oil bodies in Brassica napus transformed with a 
Zea mays oleosin gene. The gene was expressed under the control of regulatory elements from a 
Brassica gene encoding napin, a major seed storage protein. The temporal regulation and tissue 
specificity of expression was reported to be correct for a napin gene promoter/terminator (Lee et al., 
1991, Proc. Natl. Acad. Sci. U.S.A., 88:6181-6185). 

By providing a subset of genes encoding oleosins, it is now possible to modify the oleosin 
content in the phospholipid monolayer membrane of oil bodies by either introducing the genes 
provided herein into a plant and overexpressing said genes in said plant or, in the alternative, by 
down-regulating expression of the endogenous oleosin encoding genes in the plant using methods 
known in the art including anti-sense or dsRNAi techniques. 

In one specific embodiment, the present invention thus relates to a polynucleotide comprising 
a nucleotide sequence encoding an oleosin protein, which nucleotide sequence is substantially similar 
to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 258 and 260. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding an oleosin protein, which is up-regulated during grain filling and has at least between 70% 
and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 258 and 260, 
with any individual number within this range of between 70% and 99% also being part of the 
invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
an oleosin protein, which is up-regulated during grain filling and immunologically reactive with 
antibodies raised against a polypeptide of SEQ ID NOs: 258 and 260. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 257 and 259 or a part thereof which still 
encodes a partial-length polypeptide having substantially the same activity as the 
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 257 and 259, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

At least one of the genes provided herein, which is up-regulated during grain filling, encodes a 
phytoene dehydrogenase polypeptide that is involved in carotenoid biosynthesis and can thus be 
used to modify carotenoid production in grain. 

Carotenoids are natural pigments that are essential to microbial, plant, and animal life. In 
photosynthetic organisms, they act as potent antioxidants that negate the lethal effects of singlet 
oxygen and superoxide formed during oxygen production. As human dietary constituents, these 
lipophilic antioxidants provide our cells with chemical protectants against the damaging effects of 
oxidation. Acting as chemical scavengers, carotenoids play roles in the prevention of cancer and 
chronic maladies, including heart disease. 

Phytoene (7,8,11,12,7',8',11',12'-.omega.-octahydro-.omega.,.omega.-carotene) is the first 
carotenoid in the carotenoid biosynthesis pathway and is produced by the dimerization of a 
20-carbon atom precursor, geranylgeranyl pyrophosphate (GGPP). Phytoene has useful applications in 
treating skin disorders (U.S. Pat. No. 4,642,318) and is itself a precursor for colored carotenoids. 
Aside from certain mutant organisms, such as Phycomyces blakesleeanus carB, no current methods 
are available for producing phytoene via any biological process. 

In some organisms, the red carotenoid lycopene (.omega.,.omega.-carotene) is the next 
carotenoid produced from phytoene in the pathway. Lycopene imparts the characteristic red color 
to ripe tomatoes. Lycopene has utility as a food colorant. It is also an intermediate in the 
biosynthesis of other carotenoids in some bacteria, fungi and green plants. 

Lycopene is prepared biosynthetically from phytoene through four sequential dehydrogenation 
reactions by the removal of eight atoms of hydrogen. The enzymes that remove hydrogen from 
phytoene are phytoene dehydrogenases. One or more phytoene dehydrogenases can be used to 
convert phytoene to lycopene, and dehydrogenated derivatives of phytoene intermediate to lycopene 
are also known. For example, some strains of Rhodobacter sphaeroides contain a phytoene 
dehydrogenase that removes six atoms of hydrogen from phytoene to produce neurosporene. 
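
The hydrogen bookkeeping in the paragraph above can be checked with simple arithmetic. The sketch below assumes the standard molecular formula of phytoene, C40H64, which is not stated in the text, and treats each dehydrogenation step as the removal of two hydrogen atoms. The function name is hypothetical.

```python
# (carbon atoms, hydrogen atoms); C40H64 for phytoene is an assumption added for illustration.
PHYTOENE = (40, 64)


def dehydrogenate(formula, steps):
    """Remove two hydrogen atoms per dehydrogenation step; carbon count is unchanged."""
    carbons, hydrogens = formula
    if steps < 0 or 2 * steps > hydrogens:
        raise ValueError("invalid number of dehydrogenation steps")
    return carbons, hydrogens - 2 * steps


neurosporene = dehydrogenate(PHYTOENE, 3)  # six hydrogen atoms removed
lycopene = dehydrogenate(PHYTOENE, 4)      # eight hydrogen atoms removed
print(neurosporene)  # (40, 58)
print(lycopene)      # (40, 56)
```

Three steps give C40H58 and four steps give C40H56, consistent with the six and eight hydrogen atoms named in the text for neurosporene and lycopene.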

Lycopene is an intermediate in the biosynthesis of carotenoids in some bacteria, fungi, and all 
green plants. Carotenoid-specific genes that can be used for synthesis of lycopene from the 
ubiquitous precursor farnesyl pyrophosphate include those for the enzymes GGPP synthase, 
phytoene synthase, and phytoene dehydrogenase-4H. 

In one specific embodiment the present invention relates to a polynucleotide comprising a 
nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the 
dehydrogenation of phytoene and the expression of which is up-regulated during grain filling, which 
nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as 
given in SEQ ID NO: 278. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide the activity of which is involved in or associated with the dehydrogenation of 
phytoene and the expression of which is up-regulated during grain filling and which has at least 
between 70% and 99% amino acid sequence identity to the polypeptide of SEQ ID NO: 
278, with any individual number within this range of between 70% and 99% also being part of the 
invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene 
and the expression of which is up-regulated during grain filling and which is immunologically reactive 
with antibodies raised against a polypeptide of SEQ ID NO: 278. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in SEQ ID NO: 277 or a part thereof which still encodes a 
partial-length polypeptide having substantially the same activity as the full-length 
polypeptide, e.g., at least 50%, more preferably at least 80%, even more 
preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of the nucleotide sequence given in SEQ ID 
NO: 277, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

Another subset of genes that is provided as part of the invention comprises nucleic acid 
molecules that are involved in the transcriptional control of the highly coordinated grain filling 
process. 

Transcription factors are proteins that bind to the enhancer or promoter regions and interact 
such that transcription occurs from only a small group of promoters in any cell. Most transcription 
factors can bind to specific DNA sequences, and these trans-regulatory proteins can be grouped 
together in families based on similarities in structure. Within such a family, proteins share a common 
framework structure in their respective DNA-binding sites, and slight differences in the amino acids 
at the binding site can alter the sequence of the DNA to which the protein binds. In addition to having this 
sequence-specific DNA-binding domain, transcription factors contain a domain involved in activating 
the transcription of the gene whose promoter or enhancer they have bound. Usually, this trans-activating 
domain enables that transcription factor to interact with proteins involved in binding RNA 
polymerase. This interaction often enhances the efficiency with which the basal transcriptional 
complex can be built and bind RNA polymerase II. There are several families of transcription 
factors, and those discussed here are just some of the main types. 

The gene subset provided herein includes a gene which encodes a polypeptide that is similar 
to the CREB-binding protein from Mus sp. (as represented by SEQ ID NO: 301), and is highly 
expressed in aleurone and endosperm tissues during grain filling. CREB-binding protein (CBP) is a 
necessary component of the CREB/PKA paradigm of gene regulation. The acetylation of histones 
and other proteins has been linked to gene regulation, and CBP has a potent intrinsic 
acetyltransferase (AT) enzymatic domain. CREB belongs to a class of proteins whose 
phosphorylation appears specifically to enhance their trans-activation potential (Arias J, et al. Nature 
1994 Jul 21;370(6486):226-9). 

CBP possesses intrinsic histone acetyltransferase activity, and can acetylate not only histones 
but also certain transcriptional factors such as GATA1 and p53, and also myb-type transcription factors 
such as c-Myb (Yuji Sano and Shunsuke Ishii J. Biol. Chem., Vol. 276, Issue 5, 3674-3682, 
February 2, 2001). Acetylation of c-Myb by CBP increases the trans-activating capacity of c-Myb 
by enhancing its association with CBP. These results demonstrate a novel molecular mechanism of 
regulation of c-Myb activity. 

In rice, 70 known and putative MYB genes could be identified, some of which show 
interesting expression patterns such as those given in SEQ ID NOs: 311-321. The expression 
pattern of these transcription factors suggests that they play a key role during rice grain filling. 

Another transcription factor gene (as represented by SEQ ID NO: 305) included in this 
subset encodes a protein that has structural similarity to the yeast HAP5 transcriptional activator 
protein. In yeast, the HAP5 protein is a component of the HAP (Hap2p-Hap3p-Hap4p-Hap5p) 
CCAAT-box-binding transcriptional activation complex and is essential for the binding activity of the 
complex. 

A further transcription factor gene within this subset is represented by SEQ ID NO: 307, which 
encodes a bZIP-type transcription factor similar to the plant G-box binding factor GBF4, which was 
found in Arabidopsis. GBF4, in a manner reminiscent of the Fos-related oncoproteins of mammalian 
systems, cannot bind to DNA as a homodimer, although it contains a basic region capable of 
specifically recognizing the G-box and G-box-like elements. However, GBF4 can interact with 
GBF2 and GBF3 to bind DNA as heterodimers. Mutagenesis of the leucine zipper of GBF4 
indicates that the mutation of a single amino acid confers upon the protein the ability to recognize the 
G-box as a homodimer, apparently by altering the charge distribution within the leucine zipper (AE 
Menkens and AR Cashmore (1994) PNAS 91: 2522-2526). 

Another of the transcription factor genes within this subset encodes a protein that has a zinc 
finger domain and is similar to a zinc-finger type transcription factor found in Arabidopsis 
(gi|6899934). 

Zinc finger proteins include WT-1 (an important transcription factor critical in the formation of 
the kidney and gonads); the ubiquitous transcription factor Sp1; Xenopus 5S rRNA transcription 
factor TFIIIA; Krox 20 (a protein that regulates gene expression in the developing hindbrain); Egr-1 
(which commits white blood cell development to the macrophage lineage); Krüppel (a protein that 
specifies abdominal cells in Drosophila); and numerous steroid-binding transcription factors. Each of 
these proteins has two or more "DNA-binding fingers," α-helical domains whose central amino acids 
tend to be basic. These domains are linked together in tandem and are each stabilized by a centrally 
located zinc ion coordinated by two cysteines (at the base of the helix) and two internal histidines. 
The crystal structure shows that the zinc fingers bind in the major groove of the DNA. 

The expression pattern of these transcription factors during grain filling suggests that they play 
a key role during rice grain development. This is further supported by the fact that the AACA 
promoter element, which is known to be conserved in many seed storage protein genes, is 
over-represented in the promoters of the grain filling subset genes according to the invention. This subset 
comprises genes the protein products of which are involved in diverse cellular functions, including 
carbohydrate, protein and fatty acid metabolism, nutrient transportation, and transcription and 
translation. The AACA promoter element was thus demonstrated to be likely one of the key 
elements in the coordination of different major pathways during grain development. 
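
The text does not say how over-representation of the AACA element was measured. Purely as an illustration, one simple approach is to compare the motif frequency in a set of promoters with that in a background set. In the sketch below, all sequences are invented, the function names are hypothetical, only the given strand is scanned, and no statistical test is applied.

```python
def count_motif(seq: str, motif: str = "AACA") -> int:
    """Count occurrences of the motif on the given strand, allowing overlaps."""
    seq = seq.upper()
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def motif_per_kb(promoters, motif: str = "AACA") -> float:
    """Motif occurrences per 1000 nucleotides, pooled over a list of sequences."""
    total_length = sum(len(p) for p in promoters)
    total_hits = sum(count_motif(p, motif) for p in promoters)
    return 1000.0 * total_hits / total_length


# Invented toy sequences; real promoters would be hundreds of nucleotides or longer.
subset_promoters = ["AACAAACATTT", "GGAACAGG"]
background_promoters = ["GGGGCCCCTTTT", "AACAGGGG"]

enrichment = motif_per_kb(subset_promoters) / motif_per_kb(background_promoters)
print(count_motif("AACAAACATTT"))  # 2
print(enrichment > 1.0)            # True
```

A ratio above 1 indicates the motif is more frequent in the subset than in the background; a real analysis would also need a significance test and a properly chosen background.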

In one embodiment the invention thus relates to a polynucleotide comprising a nucleotide 
sequence that encodes a polypeptide that acts as a transcription factor and the expression of which is 
up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid 
sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 11 such as SEQ 
ID NOs: 302-328. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
that encodes a polypeptide that acts as a transcription factor and the expression of which is up-regulated 
during grain filling and which has at least between 70% and 99% amino acid sequence identity to at 
least one polypeptide as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 
302-328, with any individual number within this range of between 70% and 99% also being part of 
the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence that encodes a 
polypeptide that acts as a transcription factor and the expression of which is up-regulated during 
grain filling and which is immunologically reactive with antibodies raised against a polypeptide as 
given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 302-328. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 301- 
327 or a part thereof which still encodes a partial-length polypeptide having 
substantially the same activity as the full-length polypeptide, e.g., at least 50%, 
more preferably at least 80%, even more preferably at least 90% to 95% the 
activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ 
ID NOs of table 11 such as SEQ ID NOs: 301-327, or the complement 
thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

By changing the expression level and/or pattern of at least one transcription factor as 
provided herein, which is involved in the regulation and coordination of grain filling in plants, it is 
possible to modify the grain filling process to obtain grain with a modified nutritional composition 
and/or quality characteristics. 

A further subset of genes which is provided herein comprises genes encoding polypeptides the 
activity of which is involved in or associated with amino acid metabolism. 

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide the activity of which is involved in or associated with the metabolism of amino 
acids and the expression of which is up-regulated during grain filling, which nucleotide sequence is 
substantially similar to a nucleic acid sequence encoding a polypeptide as given in any one of the 
SEQ ID NOs of table 10 such as SEQ ID NOs: 282 - 300. 

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence 
encoding a polypeptide the activity of which is involved in or associated with the metabolism of amino 
acids and the expression of which is up-regulated during grain filling, which polypeptide has at least 
between 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one 
of the SEQ ID NOs of table 10 such as SEQ ID NOs: 282 - 300, with any individual number within 
this range of between 70% and 99% also being part of the invention. 

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding 
a polypeptide the activity of which is involved in or associated with the metabolism of amino acids and 
the expression of which is up-regulated during grain filling, which polypeptide is immunologically 
reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 
10 such as SEQ ID NOs: 282 - 300. 

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence 

a) as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 281 - 
299 or a part thereof which still encodes a partial-length polypeptide having 
substantially the same activity as the full-length polypeptide, e.g., at least 50%, 
more preferably at least 80%, even more preferably at least 90% to 95% the 
activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ 
ID NOs of table 10 such as SEQ ID NOs: 281 - 299, or the complement 
thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 
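
Items (e) and (f) distinguish the complement from the reverse complement of a sequence. A minimal sketch of the two operations follows, assuming plain DNA strings over A, C, G and T; the helper names are illustrative only.

```python
# Translation table for Watson-Crick base pairing (DNA, both letter cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order as the input."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction, i.e. the opposite strand written 5' to 3'."""
    return complement(seq)[::-1]
```

For example, the complement of ATGC is TACG, whereas its reverse complement is GCAT.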

In a final embodiment, the present invention provides a subset of genes encoding polypeptides 
for which no biological function is known so far. It is within the scope of this invention that the 
expression products of these genes, representative examples of which are provided in column B of 
table 3, can for the first time be associated with a biological function. Based on their mRNA 
expression characteristics and their specific expression pattern during grain filling it is suggested that 
they are involved in or associated with nutrient partitioning during the grain filling process. 

By modifying the expression of at least one of the genes within this subgroup it is, therefore, 
possible to modify the compositional characteristics and thus the nutritional properties of the plant 
grain. 

The present invention provides a set of genes, which were shown to be preferentially up- 
regulated and to share a similar expression pattern during the process of grain filling as specified 
hereinbefore. The genes within this subgroup are useful tools for generating plants which produce 
grain with modified compositional characteristics leading to improved nutritional properties. 

According to one embodiment, the present invention is directed to a nucleic acid molecule 
comprising a nucleotide sequence isolated or obtained from any plant which encodes a polypeptide 
that has at least 70% amino acid sequence identity to a polypeptide encoded by a gene comprising 
any one of SEQ ID NOs provided in the Sequence Listing. 

Based on the Oryza nucleic acid sequences of the present invention as given in the SEQ ID 
NOs of the Sequence Listing, orthologs may be identified or isolated from the genome of any desired 
organism, preferably from another plant, according to well known techniques based on their 
sequence similarity to the Oryza nucleic acid sequences, e.g., hybridization, PCR or computer 
generated sequence comparisons. For example, all or a portion of a particular Oryza nucleic acid 
sequence is used as a probe that selectively hybridizes to other gene sequences present in a 
population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) 
from a chosen source organism. Further, suitable genomic and cDNA libraries may be prepared 
from any cell or tissue of an organism. Such techniques include hybridization screening of plated 
DNA libraries (either plaques or colonies; see, e.g., Sambrook et al., 1989) and amplification by 
PCR using oligonucleotide primers preferably corresponding to sequence domains conserved among 
related polypeptides or subsequences of the nucleotide sequences provided herein (see, e.g., Innis et 
al., 1990). These methods are particularly well suited to the isolation of gene sequences from 
organisms closely related to the organism from which the probe sequence is derived. The application 
of these methods using the Oryza sequences as probes is well suited for the isolation of gene 
sequences from any source organism, preferably other plant species. In a PCR approach, 
oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA 
sequences from cDNA or genomic DNA extracted from any plant of interest. Methods for 
designing PCR primers and PCR cloning are generally known in the art. 

In hybridization techniques, all or part of a known nucleotide sequence is used as a probe 
that selectively hybridizes to other corresponding nucleotide sequences present in a population of 
cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a 
chosen organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, 
RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as ³²P, 
or any other detectable marker. Thus, for example, probes for hybridization can be made by 
labeling synthetic oligonucleotides based on the sequence of the invention. Methods for preparation 
of probes for hybridization and for construction of cDNA and genomic libraries are generally known 
in the art and are disclosed in Sambrook et al. (1989). In general, sequences that hybridize to the 
sequences disclosed herein will have at least 40% to 50%, about 60% to 70% and even about 80%, 
85%, 90%, 95% to 98% or more identity with the disclosed sequences. That is, the sequence 
similarity of sequences may range, sharing at least about 40% to 50%, about 60% to 70%, and even 
about 80%, 85%, 90%, 95% to 98% sequence similarity, with each individual number within the 
ranges given above also being part of the invention. 

The nucleic acid molecules of the invention can also be identified by, for example, a search of 
known databases for genes encoding polypeptides having a specified amino acid sequence identity 
or DNA having a specified nucleotide sequence identity. Methods of alignment of sequences for 
comparison are well known in the art and are described hereinabove. 

In a further embodiment, the invention provides isolated nucleic acid molecules comprising a 
plant nucleotide sequence that induces transcription of a linked nucleic acid segment in a plant or 
plant cell, e.g., a linked nucleic acid molecule comprising an open reading frame for or encoding a 
structural or regulatory gene, in a tissue-specific or tissue-preferential manner. 

In a specific embodiment, the invention provides isolated nucleic acid molecules comprising a 
plant nucleotide sequence that induces transcription of a linked nucleic acid segment in a plant or 
plant cell, e.g., a linked nucleic acid molecule comprising an open reading frame for or encoding a 
structural or regulatory gene, in a seed-specific or seed-preferential manner. In particular, the plant 
nucleotide sequence according to the invention is substantially less active in vegetative tissue as 
compared to seed and is most active in the endosperm. The transcription-inducing activity increases 
during seed development and reaches its peak at or around the time of grain filling. 

In particular, the nucleotide sequence of the invention directs seed- (e.g., endosperm-) 
specific or seed- (e.g., endosperm-) preferential transcription of a linked nucleic acid segment in a 
plant or plant cell and is preferably obtained or obtainable from plant genomic DNA having a gene 
comprising an open reading frame (ORF) encoding a polypeptide which is substantially similar, and 
preferably has at least 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, and even 90% or more, e.g., 91%, 92%, 93%, 
94%, 95%, 96%, 97%, 98%, and 99%, amino acid sequence identity, to a polypeptide encoded by 
an Oryza, e.g., Oryza sativa, gene comprising any one of SEQ ID NOs: 2 - 462 (e.g., including a 
promoter obtained or obtainable from any one of SEQ ID NOs: 643 - 883) which directs seed- 
specific (or seed-preferential) transcription of a linked nucleic acid segment. 

The promoters of the invention include a consecutive stretch of about 25 to 2000, including 
50 to 500 or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to about 750, 
60 to about 750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 750, of any 
one of SEQ ID NOs: 643 - 883, or the promoter orthologs thereof, which include the minimal 
promoter region. 

In a particular embodiment of the invention said consecutive stretch of about 25 to 2000, 
including 50 to 500 or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to 
about 750, 60 to about 750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 
750, has at least 75%, preferably 80%, more preferably 90% and most preferably 95%, nucleic acid 
sequence identity with a corresponding consecutive stretch of about 25 to 2000, including 50 to 500 
or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to about 750, 60 to about 
750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 750, of any one of SEQ 
ID NOs: 643 - 883 or the promoter orthologs thereof, which include the minimal promoter region. 
The above defined stretch of contiguous nucleotides preferably comprises one or more promoter 
motifs, e.g., for seed-specific promoters, motifs selected from the group consisting of the P box and 
GCN4 elements, including but not limited to TGTAAAG and TGA(G/C)TCA, and a transcription 
start site. 
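
The two consensus sequences named above can be located in a candidate promoter stretch with a simple pattern search. The sketch below is illustrative only: it assumes a plain DNA string, scans the given strand only (scanning the reverse complement as well would be a natural extension), and uses hypothetical names throughout.

```python
import re

# Consensus motifs quoted in the text; (G/C) becomes the character class [GC].
MOTIFS = {
    "P box (TGTAAAG)": re.compile(r"TGTAAAG"),
    "TGA(G/C)TCA element": re.compile(r"TGA[GC]TCA"),
}


def find_motifs(seq: str):
    """Return (motif name, 0-based start, matched text) tuples sorted by position."""
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])
```

A stretch returning no hits on either strand would, under this simple criterion, lack both motifs; the presence of a motif does not by itself demonstrate promoter activity.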

In case of promoters directing tissue-specific transcription of a linked nucleic acid segment in 
a plant or plant cell such as, for example, a promoter directing seed-specific or seed-preferential, but 
especially endosperm-specific or endosperm-preferential transcription, it is further preferred that 
the previously defined stretch of contiguous nucleotides comprises further motifs that participate in the 
tissue specificity of said stretch(es) of nucleotides. 

Generally, the promoters of the invention may be employed to express a nucleic acid segment 
that is operably linked to said promoter such as, for example, an open reading frame, or a portion 
thereof, an antisense sequence, or a transgene in plants. The open reading frame may be obtained 
from an insect resistance gene, a disease resistance gene such as, for example, a bacterial disease 
resistance gene, a fungal disease resistance gene, a viral disease resistance gene, a nematode disease 
resistance gene, a herbicide resistance gene, a gene affecting grain composition or quality, a nutrient 
utilization gene, a mycotoxin reduction gene, a male sterility gene, a selectable marker gene, a 
screenable marker gene, a negative selectable marker, a positive selectable marker, a gene affecting 
plant agronomic characteristics, i.e., yield, standability, and the like, or an environment or stress 
resistance gene, i.e., one or more genes that confer herbicide resistance or tolerance, insect 
resistance or tolerance, disease resistance or tolerance (viral, bacterial, fungal, oomycete, or 
nematode), stress tolerance or resistance (as exemplified by resistance or tolerance to drought, heat, 
chilling, freezing, excessive moisture, salt stress, or oxidative stress), increased yields, food content 
and makeup, physical appearance, male sterility, drydown, standability, prolificacy, starch properties 
or quantity, oil quantity and quality, amino acid or protein composition, and the like. By "resistant" is 
meant a plant which exhibits substantially no phenotypic changes as a consequence of agent 
administration, infection with a pathogen, or exposure to stress. By "tolerant" is meant a plant which, 
although it may exhibit some phenotypic changes as a consequence of infection, does not have a 
substantially decreased reproductive capacity or substantially altered metabolism. 

For instance, seed-specific promoters may be useful for expressing genes as well as for 
producing large quantities of protein, for expressing oils or proteins of interest, e.g., antibodies, genes 
for increasing the nutritional value of the seed and the like. In particular, the seed-specific or 
seed-preferential promoters according to the invention such as those provided in SEQ ID NOs: 643 - 
883 may be useful for expressing the Open Reading Frames which are represented by the nucleotide 
sequences of SEQ ID NOs: 1-461 and 501 - 511, respectively. 

Obtaining sufficient levels of transgene expression in the appropriate plant tissues is an 
important aspect in the production of genetically engineered crops. Expression of heterologous 
DNA sequences in a plant host is dependent upon the presence of an operably linked promoter that 
is functional within the plant host. Choice of the promoter sequence will determine when and where 
within the organism the heterologous DNA sequence is expressed. 

It is specifically contemplated by the present invention that one could use any one of the 
promoters according to the present invention in unaltered or altered form. Mutagenization of a 
promoter of the present invention such as those provided in SEQ ID NOs: 643 - 883 may 
potentially improve the utility of the elements for the expression of transgenes in plants. The 
mutagenesis of these elements can be carried out at random and the mutagenized promoter 
sequences screened for activity in a trial-and-error procedure. 

Alternatively, particular sequences which provide the promoter with desirable expression 
characteristics, or the promoter with expression enhancement activity, could be identified and these 
or similar sequences introduced into the sequences via mutation. It is further contemplated that one 
could mutagenize these sequences in order to enhance their expression of transgenes in a particular 
species. 

The means for mutagenizing a DNA segment encoding a promoter sequence of the current 
invention are well known to those of skill in the art. As indicated, modifications to a promoter or other 
regulatory element may be made by random or site-specific mutagenesis procedures. The promoter 
and other regulatory elements may be modified by altering their structure through the addition or 
deletion of one or more nucleotides from the sequence which encodes the corresponding 
unmodified sequences. 

Mutagenesis may be performed in accordance with any of the techniques known in the art, 
such as, and not limited to, synthesizing an oligonucleotide having one or more mutations within the 
sequence of a particular regulatory region. In particular, site-specific mutagenesis is a technique 
useful in the preparation of promoter mutants, through specific mutagenesis of the underlying DNA. 
The technique further provides a ready ability to prepare and test sequence variants, for example, 
incorporating one or more of the foregoing considerations, by introducing one or more nucleotide 
sequence changes into the DNA. Site-specific mutagenesis allows the production of mutants through 
the use of specific oligonucleotide sequences which encode the DNA sequence of the desired 
mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of 
sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction 
being traversed. Typically, a primer of about 17 to about 75 nucleotides or more in length is 
preferred, with about 10 to about 25 or more residues on both sides of the junction of the sequence 
being altered. 
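
The primer geometry described above (the altered sequence flanked on both sides by unchanged template residues) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, its parameters and the default flank of 12 residues are hypothetical, and practical primer design would also weigh melting temperature and secondary structure, which are not modeled here.

```python
def design_mutagenic_primer(template: str, start: int, end: int,
                            replacement: str, flank: int = 12) -> str:
    """Build an oligonucleotide in which template[start:end] is replaced by
    `replacement`, with `flank` unchanged template residues on each side.

    An empty `replacement` yields a primer that spans a deletion junction.
    """
    template = template.upper()
    if not 0 <= start <= end <= len(template):
        raise ValueError("start/end must delimit a region inside the template")
    if start < flank or len(template) - end < flank:
        raise ValueError("template too short for the requested flanks")
    left = template[start - flank:start]   # unchanged residues upstream of the junction
    right = template[end:end + flank]      # unchanged residues downstream of the junction
    return left + replacement.upper() + right
```

With 12-residue flanks, a 7-nucleotide substitution gives a 31-mer and a pure deletion gives a 24-mer, both within the 17 to 75 nucleotide range mentioned in the text.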

In general, the technique of site-specific mutagenesis is well known in the art, as exemplified 
by various publications. As will be appreciated, the technique typically employs a phage vector 
which exists in both a single stranded and double stranded form. Typical vectors useful in 
site-directed mutagenesis include vectors such as the M13 phage. These phage are readily commercially 
available and their use is generally well known to those skilled in the art. 

Double stranded plasmids also are routinely employed in site-directed mutagenesis, which 
eliminates the step of transferring the gene of interest from a plasmid to a phage. 

In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a 
single-stranded vector or melting apart the two strands of a double-stranded vector which includes 
within its sequence a DNA sequence which encodes the promoter. An oligonucleotide primer 
bearing the desired mutated sequence is prepared, generally synthetically. This primer is then 
annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. 
coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing 
strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated 
sequence and the second strand bears the desired mutation. 

This heteroduplex vector is then used to transform or transfect appropriate cells, such as E. 
coli cells, and cells are selected which include recombinant vectors bearing the mutated sequence 
arrangement. Vector DNA can then be isolated from these cells and used for plant transformation. A 
genetic selection scheme was devised by Kunkel et al. (1987) to enrich for clones incorporating 
mutagenic oligonucleotides. Alternatively, the use of PCR with commercially available thermostable 
enzymes such as Taq polymerase may be used to incorporate a mutagenic oligonucleotide primer 
into an amplified DNA fragment that can then be cloned into an appropriate cloning or expression 
vector. The PCR-mediated mutagenesis procedures of Tomic et al. (1990) and Upender et al. 
(1995) provide two examples of such protocols. A PCR employing a thermostable ligase in addition 
to a thermostable polymerase also may be used to incorporate a phosphorylated mutagenic 
oligonucleotide into an amplified DNA fragment that may then be cloned into an appropriate cloning 
or expression vector. The mutagenesis procedure described by Michael (1994) provides an example 
of one such protocol. 

The preparation of sequence variants of the selected promoter-encoding DNA segments 
using site-directed mutagenesis is provided as a means of producing potentially useful species and is 
not meant to be limiting as there are other ways in which sequence variants of DNA sequences may 
be obtained. For example, recombinant vectors encoding the desired promoter sequence may be 
treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants. 

As used herein, the term "oligonucleotide directed mutagenesis procedure" refers to 
template-dependent processes and vector-mediated propagation which result in an increase in the 
concentration of a specific nucleic acid molecule relative to its initial concentration, or in an increase 
in the concentration of a detectable signal, such as amplification. As used herein, the term 
"oligonucleotide directed mutagenesis procedure" also is intended to refer to a process that involves 
the template-dependent extension of a primer molecule. The term template-dependent process refers 
to nucleic acid synthesis of an RNA or a DNA molecule wherein the sequence of the newly 
synthesized strand of nucleic acid is dictated by the well-known rules of complementary base pairing 
(see, for example, Watson and Ramstad, 1987). Typically, vector mediated methodologies involve 
the introduction of the nucleic acid fragment into a DNA or RNA vector, the clonal amplification of 
the vector, and the recovery of the amplified nucleic acid fragment. Examples of such methodologies 
are provided by U.S. Patent No. 4,237,224. A number of template dependent processes are 
available to amplify the target sequences of interest present in a sample, such methods being well 
known in the art and specifically disclosed herein below. 

Where a clone comprising a promoter has been isolated in accordance with the instant 
invention, one may wish to delimit the essential promoter regions within the clone. One efficient, 
targeted means for preparing mutagenized promoters relies upon the identification of putative 
regulatory elements within the promoter sequence. This can be initiated by comparison with 
promoter sequences known to be expressed in a similar tissue-specific or developmentally unique 
manner. Sequences which are shared among promoters with similar expression patterns are likely 
candidates for the binding of transcription factors and are thus likely elements which confer 
expression patterns. Confirmation of these putative regulatory elements can be achieved by deletion 
analysis of each putative regulatory region followed by functional analysis of each deletion construct 
by assay of a reporter gene which is functionally attached to each construct. As such, once a starting 
promoter sequence is provided, any of a number of different deletion mutants of the starting 
promoter could be readily prepared. 

As indicated above, deletion mutants of the promoter of the invention also 
could be randomly prepared and then assayed. With this strategy, a series of constructs are 
prepared, each containing a different portion of the clone (a subclone), and these constructs are then 
screened for activity. A suitable means for screening for activity is to attach a deleted promoter or 
intron construct which contains a deleted segment to a selectable or screenable marker, and to 
isolate only those cells expressing the marker gene. In this way, a number of different, deleted 
promoter constructs are identified which still retain the desired, or even enhanced, activity. The 
smallest segment which is required for activity is thereby identified through comparison of the 
selected constructs. This segment may then be used for the construction of vectors for the expression 
of exogenous genes. 
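
The deletion strategy described above can be sketched as a nested series of subclones followed by selection of the shortest one that remains active. The sketch assumes progressive 5' truncations that retain the 3' end of the clone, which is only one of several possible deletion schemes; the activity test is a stand-in callable, since in practice activity is measured with a reporter or marker gene rather than computed from sequence. All names are hypothetical.

```python
from typing import Callable, List, Optional


def five_prime_deletions(promoter: str, step: int, min_length: int) -> List[str]:
    """Return progressively shorter fragments that all retain the 3' end of `promoter`."""
    if step <= 0:
        raise ValueError("step must be positive")
    constructs = []
    length = len(promoter)
    while length >= min_length:
        constructs.append(promoter[len(promoter) - length:])
        length -= step
    return constructs


def smallest_active(constructs: List[str],
                    is_active: Callable[[str], bool]) -> Optional[str]:
    """Return the shortest construct scored active by `is_active`, or None if none is."""
    active = [c for c in constructs if is_active(c)]
    return min(active, key=len) if active else None
```

In use, `is_active` would wrap the measured reporter result for each construct; comparing the shortest active construct with the next shorter inactive one brackets the region required for activity.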

Furthermore, it is contemplated that promoters combining elements from more than one 
promoter may be useful. For example, U.S. Patent No. 5,491,288 discloses combining a 
Cauliflower Mosaic Virus promoter with a histone promoter. Thus, the elements from the promoters 
disclosed herein may be combined with elements from other promoters. 

The present invention further provides a composition, an expression cassette or a 
recombinant vector containing the nucleic acid molecule of the invention as disclosed hereinbefore, and 
host cells comprising the expression cassette or vector, e.g., comprising a plasmid. 

In particular, the present invention provides an expression cassette or a recombinant vector 
comprising a suitable promoter linked to a nucleic acid segment of the invention, representative 
examples of which are provided in the SEQ ID NOs of the Sequence Listing, which, when present in 
a plant, plant cell or plant tissue, results in transcription of the linked nucleic acid segment. 

Promoters which are useful for plant transgene expression include those that are inducible, 
viral, synthetic, constitutive (Odell et al., 1985), temporally regulated, spatially regulated, tissue- 
specific, and spatio-temporally regulated. 

Where expression in specific tissues or organs is desired, tissue-specific promoters may be 
used. In contrast, where gene expression in response to a stimulus is desired, inducible promoters 
are the regulatory elements of choice. Where continuous expression is desired throughout the cells of 
a plant, constitutive promoters are utilized. Additional regulatory sequences upstream and/or 
downstream from the core promoter sequence may be included in expression constructs of 
transformation vectors to bring about varying levels of expression of heterologous nucleotide 
sequences in a transgenic plant. 

Suitable promoter and/or regulatory sequences further include those that are preferentially or 
specifically active in plant grain tissue such as, for example, the grain endosperm or the grain embryo. 

Further, the invention provides isolated polypeptides encoded by any one of the open 
reading frames of the invention, representative examples of which are provided in the SEQ ID NOs 
of the Sequence Listing, or a fragment thereof, which encodes a polypeptide which has substantially 
the same activity as the corresponding polypeptide encoded by an ORF given in the SEQ ID NOs 
of the Sequence Listing, or the orthologs thereof. 

Virtually any DNA composition may be used for delivery to recipient plant cells, e.g., 
monocotyledonous cells, to ultimately produce fertile transgenic plants in accordance with the present 
invention. For example, DNA segments or fragments in the form of vectors and plasmids, or linear 
DNA segments or fragments, in some instances containing only the DNA element to be expressed in 
the plant, and the like, may be employed. The construction of vectors which may be employed in 
conjunction with the present invention will be known to those of skill in the art in light of the present 
disclosure (see, e.g., Sambrook et al., 1989; Gelvin et al., 1990). 

It is one of the objects of the present invention to provide recombinant DNA molecules 
comprising a nucleotide sequence which directs transcription according to the invention operably 
linked to a nucleic acid segment or sequence of interest. 

The nucleic acid segment of interest can, for example, code for a ribosomal RNA, an 
antisense RNA or any other type of RNA that is not translated into protein. In another preferred 
embodiment of the invention, the nucleic acid segment of interest is translated into a protein product. 
The nucleotide sequence which directs transcription and/or the nucleic acid segment may be of 
homologous or heterologous origin with respect to the plant to be transformed. A recombinant DNA 
molecule useful for introduction into plant cells includes that which has been derived or isolated from 
any source, that may be subsequently characterized as to structure, size and/or function, chemically 
altered, and later introduced into plants. An example of a nucleotide sequence or segment of interest 
"derived" from a source, would be a nucleotide sequence or segment that is identified as a useful 
fragment within a given organism, and which is then chemically synthesized in essentially pure form. 
An example of such a nucleotide sequence or segment of interest "isolated" from a source, would be 
a nucleotide sequence or segment that is excised or removed from said source by chemical means, 
e.g., by the use of restriction endonucleases, so that it can be further manipulated, e.g., amplified, for 
use in the invention, by the methodology of genetic engineering. Such a nucleotide sequence or 
segment is commonly referred to as "recombinant." 

Therefore a useful nucleotide sequence, segment or fragment of interest includes completely 
synthetic DNA, semi-synthetic DNA, DNA isolated from biological sources, and DNA derived from 
introduced RNA. Generally, the introduced DNA is not originally resident in the plant genotype 
which is the recipient of the DNA, but it is within the scope of the invention to isolate a gene from a 
given plant genotype, and to subsequently introduce multiple copies of the gene into the same 
genotype, e.g., to enhance production of a given gene product such as a storage protein or a protein 
that is involved in carbohydrate metabolism or any other gene of interest as provided in the SEQ ID 
NOs of the sequence listing. 

The introduced recombinant DNA molecule includes but is not limited to, DNA from plant 
genes, and non-plant genes such as those from bacteria, yeasts, animals or viruses. The introduced 
DNA can include modified genes, portions of genes, or chimeric genes, including genes from the 
same or different genotype. The term "chimeric gene" or "chimeric DNA" is defined as a gene or 
DNA sequence or segment comprising at least two DNA sequences or segments from species which 
do not combine DNA under natural conditions, or which DNA sequences or segments are 
positioned or linked in a manner which does not normally occur in the native genome of an 
untransformed plant. 

The introduced recombinant DNA molecule used for transformation herein may be circular 
or linear, double-stranded or single-stranded. Generally, the DNA is in the form of chimeric DNA, 
such as plasmid DNA, that can also contain coding regions flanked by regulatory sequences which 
promote the expression of the recombinant DNA present in the resultant plant. 

Generally, the introduced recombinant DNA molecule will be relatively small, i.e., less than 
about 30 kb to minimize any susceptibility to physical, chemical, or enzymatic degradation which is 
known to increase as the size of the nucleotide molecule increases. As noted above, the number of 
proteins, RNA transcripts or mixtures thereof which is introduced into the plant genome is preferably 
preselected and defined, e.g., from one to about 5-10 such products of the introduced DNA may be 
formed. 

This expression cassette or vector may be contained in a host cell. The expression cassette or 
vector may augment the genome of a transformed plant or may be maintained extrachromosomally. 
The expression cassette may be operatively linked to a structural gene, the open reading frame 
thereof, or a portion thereof. The expression cassette may further comprise a Ti plasmid and be 
contained in an Agrobacterium tumefaciens cell; it may be carried on a microparticle, wherein the 
microparticle is suitable for ballistic transformation of a plant cell; or it may be contained in a plant 
cell or protoplast. Further, the expression cassette or vector can be contained in a transformed plant 
or cells thereof, and the plant may be a dicot or a monocot. In particular, the plant may be a cereal 
plant. 

Obtaining sufficient levels of transgene expression in the appropriate plant tissues is an 
important aspect in the production of genetically engineered crops. Expression of heterologous 
DNA sequences in a plant host is dependent upon the presence of an operably linked promoter that 
is functional within the plant host. Choice of the promoter sequence will determine when and where 
within the organism the heterologous DNA sequence is expressed. 

For example, for overexpression, a plant promoter fragment may be employed which will 
direct expression of the gene in all tissues of a regenerated plant. Such promoters are referred to 
herein as "constitutive" promoters and are active under most environmental conditions and states of 
development or cell differentiation. Examples of constitutive promoters include the cauliflower mosaic 
virus (CaMV) 35S transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of 
Agrobacterium tumefaciens, and other transcription initiation regions from various plant genes known 
to those of skill. Such genes include for example, the AP2 gene, ACT11 from Arabidopsis (Huang et 
al. Plant Mol Biol 33:125-139 (1996)), Cat3 from Arabidopsis (GenBank No. U43147, Zhong et 
al., Mol. Gen. Genet. 251:196-203 (1996)), the gene encoding stearoyl-acyl carrier protein 
desaturase from Brassica napus (Genbank No. X74782, Solocombe et al. Plant Physiol. 
104:1167-1176 (1994)), GPc1 from maize (GenBank No. X15596, Martinez et al. J. Mol. Biol 208:551-565 
(1989)), and Gpc2 from maize (GenBank No. U45855, Manjunath et al., Plant Mol. Biol. 
33:97-112 (1997)). 

Alternatively, the plant promoter may direct expression of the nucleic acid molecules of the 
invention in a specific tissue or may be otherwise under more precise environmental or 
developmental control. Examples of environmental conditions that may affect transcription by 
inducible promoters include anaerobic conditions, elevated temperature, or the presence of light. 
Such promoters are referred to here as "inducible" or "tissue-specific" promoters. One of skill will 
recognize that a tissue-specific promoter may drive expression of operably linked sequences in 
tissues other than the target tissue. Thus, as used herein a tissue-specific promoter is one that drives 
expression preferentially in the target tissue, but may also lead to some expression in other tissues as 
well. 

Examples of promoters under developmental control include promoters that initiate
transcription only (or primarily only) in certain tissues, such as fruit, seeds, or flowers. Promoters that
direct expression of nucleic acids in ovules, flowers or seeds are particularly useful in the present
invention. As used herein a seed-specific or preferential promoter is one which directs expression
specifically or preferentially in seed tissues; such promoters may be, for example, ovule-specific,
embryo-specific, endosperm-specific, integument-specific, seed coat-specific, or some combination
thereof. Examples include a promoter from the ovule-specific BEL1 gene described in Reiser et al.
Cell 83:735-742 (1995) (GenBank No. U39944). Other suitable seed-specific promoters are
derived from the following genes: MAC1 from maize (Sheridan et al. Genetics 142:1009-1020
(1996)), Cat3 from maize (GenBank No. L05934, Abler et al. Plant Mol. Biol. 22:1031-1038
(1993)), the gene encoding oleosin 18 kD from maize (GenBank No. J05212, Lee et al. Plant Mol.
Biol. 26:1981-1987 (1994)), viviparous-1 from Arabidopsis (GenBank No. U93215), the gene
encoding oleosin from Arabidopsis (GenBank No. Z17657), Atmyc1 from Arabidopsis (Urao et al.
Plant Mol. Biol. 32:571-576 (1996)), the 2S seed storage protein gene family from Arabidopsis
(Conceicao et al. Plant 5:493-505 (1994)), the gene encoding oleosin 20 kD from Brassica napus
(GenBank No. M63985), napA from Brassica napus (GenBank No. J02798, Josefsson et al. JBL
26:12196-1301 (1987)), the napin gene family from Brassica napus (Sjodahl et al. Planta 197:264-
271 (1995)), the gene encoding the 2S storage protein from Brassica napus (Dasgupta et al. Gene
133:301-302 (1993)), the genes encoding oleosin A (GenBank No. U09118) and oleosin B
(GenBank No. U09119) from soybean, and the gene encoding low molecular weight sulphur rich
protein from soybean (Choi et al. Mol. Gen. Genet. 246:266-268 (1995)).

It is specifically contemplated that one could use one of the promoters that are disclosed in 
co-pending provisional US application serial no 60/325,448, filed September 26, 2001 in unaltered 
or altered form. Especially preferred are promoters that direct transcription of an associated nucleic 
acid molecule specifically or preferentially in tissues of the plant grain such as those provided in SEQ 
ID NOs: 2275-2672.

Mutagenization of a promoter such as those mentioned hereinbefore or those provided in
provisional US application serial no 60/325,448 may potentially improve the utility of the elements
for the expression of transgenes in plants. The mutagenesis of these elements can be carried out at
random and the mutagenized promoter sequences screened for activity in a trial-and-error procedure.
Alternatively, particular sequences which provide the promoter with desirable expression
characteristics, or the promoter with expression enhancement activity, could be identified and these
or similar sequences introduced into the sequences via mutation. It is further contemplated that one
could mutagenize these sequences in order to enhance their expression of transgenes in a particular
species.

Furthermore, it is contemplated that promoters combining elements from more than one 
promoter may be useful. For example, U.S. Patent No. 5,491,288 discloses combining a 
Cauliflower Mosaic Virus promoter with a histone promoter. Thus, the elements from the promoters 
disclosed herein may be combined with elements from other promoters. 

A variety of 5' and 3' transcriptional regulatory sequences are available for use in the
present invention. Transcriptional terminators are responsible for the termination of transcription and
correct mRNA polyadenylation. The 3' nontranslated regulatory DNA sequence preferably
includes from about 50 to about 1,000, more preferably about 100 to about 1,000, nucleotide base
pairs and contains plant transcriptional and translational termination sequences. Appropriate
transcriptional terminators and those which are known to function in plants include the CaMV 35S
terminator, the tml terminator, the nopaline synthase terminator, the pea rbcS E9 terminator, the
terminator for the T7 transcript from the octopine synthase gene of Agrobacterium tumefaciens,
and the 3' end of the protease inhibitor I or II genes from potato or tomato, although other 3'
elements known to those of skill in the art can also be employed. Alternatively, one also could use a
gamma coixin, oleosin 3 or other terminator from the genus Coix.

Preferred 3' elements include those from the nopaline synthase gene of Agrobacterium
tumefaciens (Bevan et al., 1983), the terminator for the T7 transcript from the octopine synthase
gene of Agrobacterium tumefaciens, and the 3' end of the protease inhibitor I or II genes from
potato or tomato.

As the DNA sequence between the transcription initiation site and the start of the coding
sequence, i.e., the untranslated leader sequence, can influence gene expression, one may also wish to
employ a particular leader sequence. Preferred leader sequences are contemplated to include those
which include sequences predicted to direct optimum expression of the attached gene, i.e., to include
a preferred consensus leader sequence which may increase or maintain mRNA stability and prevent
inappropriate initiation of translation. The choice of such sequences will be known to those of skill in
the art in light of the present disclosure. Sequences that are derived from genes that are highly
expressed in plants will be most preferred.

Other sequences that have been found to enhance gene expression in transgenic plants
include intron sequences (e.g., from Adh1, bronze1, actin1, actin 2 (WO 00/760067), or the
sucrose synthase intron) and viral leader sequences (e.g., from TMV, MCMV and AMV). For
example, a number of non-translated leader sequences derived from viruses are known to enhance
expression. Specifically, leader sequences from Tobacco Mosaic Virus (TMV), Maize Chlorotic
Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in
enhancing expression (e.g., Gallie et al., 1987; Skuzeski et al., 1990). Other leaders known in the
art include but are not limited to: Picornavirus leaders, for example, EMCV leader
(Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al., 1989); Potyvirus leaders, for
example, TEV leader (Tobacco Etch Virus); MDMV leader (Maize Dwarf Mosaic Virus); Human
immunoglobulin heavy-chain binding protein (BiP) leader (Macejak et al., 1991); untranslated
leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al., 1987);
Tobacco mosaic virus leader (TMV) (Gallie et al., 1989); and Maize Chlorotic Mottle Virus leader
(MCMV) (Lommel et al., 1991). See also Della-Cioppa et al., 1987.

Regulatory elements such as Adh intron 1 (Callis et al., 1987), sucrose synthase intron (Vasil
et al., 1989) or TMV omega element (Gallie et al., 1989), may further be included where desired.

Examples of enhancers include elements from the CaMV 35S promoter, octopine synthase
genes (Ellis et al., 1987), the rice actin 1 gene, the maize alcohol dehydrogenase gene (Callis et al.,
1987), the maize shrunken 1 gene (Vasil et al., 1989), TMV Omega element (Gallie et al., 1989) and
promoters from non-plant eukaryotes (e.g. yeast; Ma et al., 1988).
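
By way of illustration only, the regulatory elements discussed above can be pictured as ordered parts of an expression cassette. The following Python sketch is a hypothetical bookkeeping aid and not a construct of the invention: the `Element` class, the `order_cassette` and `terminator_length_class` helpers, and the assumed 5'-to-3' ordering (enhancer, promoter, leader, intron, coding sequence, terminator) are all assumptions introduced here, and actual intron or enhancer placement varies between constructs. The length classes merely restate the preferred 3' region sizes given above, treating "about" as exact.

```python
from dataclasses import dataclass

# Assumed 5'-to-3' ordering of element types (an illustration, not a rule).
ORDER = ("enhancer", "promoter", "leader", "intron", "coding", "terminator")
REQUIRED = {"promoter", "coding", "terminator"}


@dataclass(frozen=True)
class Element:
    kind: str   # one of ORDER
    name: str   # free-text label, e.g. "CaMV 35S"


def order_cassette(elements):
    """Return the elements sorted 5' to 3'; reject unknown or missing kinds."""
    kinds = {e.kind for e in elements}
    unknown = kinds - set(ORDER)
    if unknown:
        raise ValueError(f"unknown element kinds: {sorted(unknown)}")
    missing = REQUIRED - kinds
    if missing:
        raise ValueError(f"cassette lacks: {sorted(missing)}")
    return sorted(elements, key=lambda e: ORDER.index(e.kind))


def terminator_length_class(n_bp):
    """Classify a 3' nontranslated region length against the stated ranges."""
    if 100 <= n_bp <= 1000:
        return "more preferred"
    if 50 <= n_bp <= 1000:
        return "preferred"
    return "outside preferred range"


parts = [
    Element("terminator", "nopaline synthase 3' end"),
    Element("coding", "gene of interest"),
    Element("leader", "TMV omega"),
    Element("promoter", "CaMV 35S"),
    Element("intron", "Adh intron 1"),
]
names = [e.name for e in order_cassette(parts)]
print(" - ".join(names))
```

The required set (promoter, coding sequence, terminator) reflects the minimal cassette described in this section; leaders, introns and enhancers are treated as optional additions.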

Two principal methods for the control of expression are known, viz.: overexpression and
underexpression. Overexpression can be achieved by insertion of one or more than one extra copy
of the selected gene. It is, however, not unknown for plants or their progeny, originally transformed
with one or more than one extra copy of a nucleotide sequence, to exhibit the effects of
underexpression as well as overexpression. For underexpression there are two principal methods
which are commonly referred to in the art as "antisense downregulation" and "sense downregulation"
(sense downregulation is also referred to as "cosuppression"). Generically these processes are
referred to as "gene silencing". Both of these methods lead to an inhibition of expression of the target
gene.

Within the scope of the present invention, the alteration in expression of the nucleic acid molecule of
the present invention may be achieved in one of the following ways:

(1) "Sense" Suppression
Alteration of the expression of a nucleotide sequence of the present invention, preferably reduction of
its expression, is obtained by "sense" suppression (referenced in e.g. Jorgensen et al. (1996) Plant
Mol. Biol. 31, 957-973). In this case, the entirety or a portion of a nucleotide sequence of the
present invention is comprised in a DNA molecule. The DNA molecule is preferably operatively
linked to a promoter functional in a cell comprising the target gene, preferably a plant cell, and
introduced into the cell, in which the nucleotide sequence is expressible. The nucleotide sequence is
inserted in the DNA molecule in the "sense orientation", meaning that the coding strand of the
nucleotide sequence can be transcribed. In a preferred embodiment, the nucleotide sequence is fully
translatable and all the genetic information comprised in the nucleotide sequence, or portion thereof,
is translated into a polypeptide. In another preferred embodiment, the nucleotide sequence is partially
translatable and a short peptide is translated. In a preferred embodiment, this is achieved by inserting
at least one premature stop codon in the nucleotide sequence, which brings translation to a halt. In
another more preferred embodiment, the nucleotide sequence is transcribed but no translation
product is made. This is usually achieved by removing the start codon, e.g. the "ATG", of the
polypeptide encoded by the nucleotide sequence. In a further preferred embodiment, the DNA
molecule comprising the nucleotide sequence, or a portion thereof, is stably integrated in the genome
of the plant cell. In another preferred embodiment, the DNA molecule comprising the nucleotide
sequence, or a portion thereof, is comprised in an extrachromosomally replicating molecule.
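
The three translatability variants described above (fully translatable, truncated by a premature stop codon, and transcribed but not translated after removal of the start codon) can be illustrated with a minimal Python sketch. The 15-nucleotide sequence and the `translate` helper are hypothetical illustrations and not sequences or methods of the invention; the helper simply reads from the first ATG it finds to the first in-frame stop codon, using the standard genetic code, which is a simplification of how translation initiation sites are selected in a cell.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns an empty string when no ATG is present (no translation product).
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":      # stop codon halts translation
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical toy coding sequence: ATG GCT AAA GGT TGA
FULL = "ATGGCTAAAGGTTGA"
# Third codon AAA changed to the stop codon TAA (premature stop).
PREMATURE_STOP = "ATGGCTTAAGGTTGA"
# Start codon removed; the remainder contains no ATG.
NO_START = FULL[3:]

for label, seq in (("full", FULL),
                   ("premature stop", PREMATURE_STOP),
                   ("no start codon", NO_START)):
    print(label, repr(translate(seq)))
```

In this toy case the premature stop variant yields a shorter peptide than the full sequence, and the variant lacking the ATG yields none, mirroring the embodiments listed above.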

In transgenic plants containing one of the DNA molecules described immediately above, the
expression of the nucleotide sequence corresponding to the nucleotide sequence comprised in the
DNA molecule is preferably reduced. Preferably, the nucleotide sequence in the DNA molecule is
at least 70% identical to the nucleotide sequence the expression of which is reduced, more
preferably it is at least 80% identical, yet more preferably at least 90% identical, yet more preferably
at least 95% identical, yet more preferably at least 99% identical.
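
The identity thresholds recited above can be made concrete with a short Python sketch. It assumes, purely for illustration, that the two sequences have already been aligned without gaps and have equal length; in practice a sequence alignment program would be used, and the function names and toy sequences below are hypothetical.

```python
# Identity thresholds recited in the text, from most to least stringent.
TIERS = (99, 95, 90, 80, 70)


def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def highest_tier(identity):
    """Return the most stringent recited threshold that is met, or None."""
    for tier in TIERS:
        if identity >= tier:
            return tier
    return None


# Hypothetical 20-nucleotide fragments differing at two positions.
transgene = "ATGGCTAAAGGTTGACCGTA"
target = "ATGGCTAAAGCTTGACCGTT"

identity = percent_identity(transgene, target)
print(identity, highest_tier(identity))
```

Two mismatches in 20 positions give 90% identity, so the pair meets the "at least 90%" threshold but not the 95% or 99% thresholds.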

(2) "Anti-sense" Suppression
In another preferred embodiment, the alteration of the expression of a nucleotide sequence of the
present invention, preferably the reduction of its expression, is obtained by "anti-sense" suppression.
The entirety or a portion of a nucleotide sequence of the present invention is comprised in a DNA
molecule. The DNA molecule is preferably operatively linked to a promoter functional in a plant cell,
and introduced in a plant cell, in which the nucleotide sequence is expressible. The nucleotide
sequence is inserted in the DNA molecule in the "anti-sense orientation", meaning that the reverse
complement (sometimes also called the non-coding strand) of the nucleotide sequence can be
transcribed. In a preferred embodiment, the DNA molecule comprising the nucleotide sequence, or
a portion thereof, is stably integrated in the genome of the plant cell. In another preferred
embodiment the DNA molecule comprising the nucleotide sequence, or a portion thereof, is
comprised in an extrachromosomally replicating molecule. Several publications describing this
approach are cited for further illustration (Green, P. J. et al., Ann. Rev. Biochem. 55:569-597
(1986); van der Krol, A. R. et al., Antisense Nuc. Acids & Proteins, pp. 125-141 (1991); Abel, P.
P. et al., Proc. Natl. Acad. Sci. USA 86:6949-6952 (1989); Ecker, J. R. et al., Proc. Natl. Acad.
Sci. USA 83:5372-5376 (Aug. 1986)).

In transgenic plants containing one of the DNA molecules described immediately above, the 
expression of the nucleotide sequence corresponding to the nucleotide sequence comprised in the
DNA molecule is preferably reduced. Preferably, the nucleotide sequence in the DNA molecule is 
at least 70% identical to the nucleotide sequence the expression of which is reduced, more 
preferably it is at least 80% identical, yet more preferably at least 90% identical, yet more preferably 
at least 95% identical, yet more preferably at least 99% identical. 
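
The "anti-sense orientation" referred to in this section corresponds to inserting the reverse complement of the coding strand. A minimal Python sketch of that operation follows; the toy sequence and the helper names `reverse_complement` and `to_rna` are hypothetical and serve only to show how the sense and anti-sense transcripts relate to one another.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def to_rna(dna):
    """Write a DNA strand as the RNA with the same sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


# Hypothetical toy coding strand, written 5' to 3'.
coding_strand = "ATGGCTAAAGGTTGA"

# Sequence placed downstream of the promoter in the anti-sense construct.
antisense_insert = reverse_complement(coding_strand)

sense_rna = to_rna(coding_strand)
antisense_rna = to_rna(antisense_insert)

print("sense RNA:     ", sense_rna)
print("anti-sense RNA:", antisense_rna)
```

Applying `reverse_complement` twice returns the original coding strand, which is a quick check that the anti-sense transcript is fully complementary to the sense transcript.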

(3) Homologous Recombination

In another preferred embodiment, at least one genomic copy corresponding to a nucleotide sequence
of the present invention is modified in the genome of the plant by homologous recombination as
further illustrated in Paszkowski et al., EMBO Journal 7:4021-26 (1988). This technique uses the
property of homologous sequences to recognize each other and to exchange nucleotide sequences
between each other by a process known in the art as homologous recombination. Homologous
recombination can occur between the chromosomal copy of a nucleotide sequence in a cell and an
incoming copy of the nucleotide sequence introduced in the cell by transformation. Specific
modifications are thus accurately introduced in the chromosomal copy of the nucleotide sequence. In
one embodiment, the regulatory elements of the nucleotide sequence of the present invention are
modified. Such regulatory elements are easily obtainable by screening a genomic library using the
nucleotide sequence of the present invention, or a portion thereof, as a probe. The existing
regulatory elements are replaced by different regulatory elements, thus altering expression of the
nucleotide sequence, or they are mutated or deleted, thus abolishing the expression of the nucleotide
sequence. In another embodiment, the nucleotide sequence is modified by deletion of a part of the
nucleotide sequence or the entire nucleotide sequence, or by mutation. Expression of a mutated
polypeptide in a plant cell is also contemplated in the present invention. More recent refinements of
this technique to disrupt endogenous plant genes have been described (Kempin et al., Nature
389:802-803 (1997) and Miao and Lam, Plant J., 7:359-365 (1995)).

In another preferred embodiment, a mutation in the chromosomal copy of a nucleotide sequence is
introduced by transforming a cell with a chimeric oligonucleotide composed of a contiguous stretch
of RNA and DNA residues in a duplex conformation with double hairpin caps on the ends. An
additional feature of the oligonucleotide is for example the presence of 2'-O-methylation at the RNA
residues. The RNA/DNA sequence is designed to align with the sequence of a chromosomal copy of
a nucleotide sequence of the present invention and to contain the desired nucleotide change. For
example, this technique is further illustrated in US patent 5,501,967 and Zhu et al. (1999) Proc.
Natl. Acad. Sci. USA 96: 8768-8773.

(4) Ribozymes

In a further embodiment, the RNA coding for a polypeptide of the present invention is cleaved by a
catalytic RNA, or ribozyme, specific for such RNA. The ribozyme is expressed in transgenic plants
and results in reduced amounts of RNA coding for the polypeptide of the present invention in plant
cells, thus leading to reduced amounts of polypeptide accumulated in the cells. This method is further
illustrated in US patent 4,987,071.

(5) Dominant-Negative Mutants 

In another preferred embodiment, the activity of the polypeptide encoded by the nucleotide 
sequences of this invention is changed. This is achieved by expression of dominant negative mutants
of the proteins in transgenic plants, leading to the loss of activity of the endogenous protein. 

(6) Aptamers 

In a further embodiment, the activity of a polypeptide of the present invention is inhibited by expressing
in transgenic plants nucleic acid ligands, so-called aptamers, which specifically bind to the protein.
Aptamers are preferentially obtained by the SELEX (Systematic Evolution of Ligands by
Exponential Enrichment) method. In the SELEX method, a candidate mixture of single stranded
nucleic acids having regions of randomized sequence is contacted with the protein and those nucleic
acids having an increased affinity to the target are partitioned from the remainder of the candidate
mixture. The partitioned nucleic acids are amplified to yield a ligand enriched mixture. After several
iterations a nucleic acid with optimal affinity to the polypeptide is obtained and is used for expression
in transgenic plants. This method is further illustrated in US patent 5,270,163.
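
The iterative partition-and-amplify logic of SELEX described above can be sketched as a simple deterministic enrichment model in Python. This is a toy illustration under stated assumptions and not the procedure of US patent 5,270,163: the pool is reduced to two hypothetical classes of nucleic acid, the starting fractions and per-round retention probabilities are invented numbers, and amplification is modelled as a plain renormalization of the retained material.

```python
def selex_round(pool, retention):
    """One round: partition by retention probability, then amplify (renormalize)."""
    kept = {species: frac * retention[species] for species, frac in pool.items()}
    total = sum(kept.values())
    return {species: amount / total for species, amount in kept.items()}


def run_selex(pool, retention, rounds):
    """Return the pool composition before round 1 and after each round."""
    history = [pool]
    for _ in range(rounds):
        pool = selex_round(pool, retention)
        history.append(pool)
    return history


# Hypothetical numbers: one binder per million molecules at the start,
# retained 100 times more efficiently than non-binders in each partition step.
start = {"high_affinity": 1e-6, "low_affinity": 1 - 1e-6}
retention = {"high_affinity": 0.5, "low_affinity": 0.005}

history = run_selex(start, retention, rounds=5)
for n, pool in enumerate(history):
    print(f"round {n}: high-affinity fraction = {pool['high_affinity']:.6f}")
```

Under these assumed numbers the rare high-affinity species comes to dominate the mixture within a handful of rounds, which is the reason only "several iterations" are needed.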

(7) Zinc finger proteins 

A zinc finger protein that binds a nucleotide sequence of the present invention or its regulatory
region is also used to alter expression of the nucleotide sequence. Preferably, transcription of the
nucleotide sequence is reduced or increased. Zinc finger proteins are for example described in
Beerli et al. (1998) PNAS 95:14628-14633, or in WO 95/19431, WO 98/54311, or WO
96/06166, all incorporated herein by reference in their entirety.

(8) dsRNA 

Alteration of the expression of a nucleotide sequence of the present invention is also obtained by 
dsRNA interference as described for example in WO 99/32619, WO 99/53050 or WO 99/61631,
all incorporated herein by reference in their entirety.

(9) Insertion of a DNA molecule (Insertional mutagenesis)
In another preferred embodiment, a DNA molecule is inserted into a chromosomal copy of a
nucleotide sequence of the present invention, or into a regulatory region thereof. Preferably, such
DNA molecule comprises a transposable element capable of transposition in a plant cell, such as, e.g.,
Ac/Ds, En/Spm, or Mutator. Alternatively, the DNA molecule comprises a T-DNA border of an
Agrobacterium T-DNA. The DNA molecule may also comprise a recombinase or integrase
recognition site which can be used to remove part of the DNA molecule from the chromosome of the
plant cell. An example of this method is set forth in Example 2. Methods of insertional mutagenesis
using T-DNA, transposons, oligonucleotides or other methods known to those skilled in the art are
also encompassed. Methods of using T-DNA and transposons for insertional mutagenesis are
described in Winkler et al. (1989) Methods Mol. Biol. 82:129-136 and Martienssen (1998) PNAS
95:2021-2026, incorporated herein by reference in their entireties.

(10) Deletion mutagenesis
In yet another embodiment, a mutation of a nucleic acid molecule of the present invention is created
in the genomic copy of the sequence in the cell or plant by deletion of a portion of the nucleotide
sequence or regulatory sequence. Methods of deletion mutagenesis are known to those skilled in the
art. See, for example, Miao et al. (1995) Plant J. 7:359.

In yet another embodiment, this deletion is created at random in a large population of plants by
chemical mutagenesis or irradiation and a plant with a deletion in a gene of the present invention is
isolated by forward or reverse genetics. Irradiation with fast neutrons or gamma rays is known to
cause deletion mutations in plants (Silverstone et al. (1998) Plant Cell, 10:155-169; Bruggemann et
al. (1996) Plant J., 10:755-760; Redei and Koncz in Methods in Arabidopsis Research, World
Scientific Press (1992), pp. 16-82). Deletion mutations in a gene of the present invention can be
recovered in a reverse genetics strategy using PCR with pooled sets of genomic DNAs as has been
shown in C. elegans (Liu et al. (1999) Genome Research, 9:859-867). A forward genetics
strategy would involve mutagenesis of a line displaying PTGS followed by screening the M2 progeny
for the absence of PTGS. Among these mutants would be expected to be some that disrupt a gene
of the present invention. This could be assessed by Southern blot or PCR for a gene of the present
invention with genomic DNA from these mutants.

(11) Overexpression in a plant cell
In yet another preferred embodiment, a nucleotide sequence of the present invention encoding a
polypeptide comprising a 3'-5' exonuclease domain and/or activity in a plant cell is overexpressed.
Examples of nucleic acid molecules and expression cassettes for overexpression of a nucleic acid
molecule of the present invention are described above. Methods known to those skilled in the art of
over-expression of nucleic acid molecules are also encompassed by the present invention.

In still another embodiment, the expression of the nucleotide sequence of the present
invention is altered in every cell of a plant. This is for example obtained through homologous
recombination or by insertion in the chromosome. This is also for example obtained by expressing a
sense or antisense RNA, zinc finger protein or ribozyme under the control of a promoter capable of
expressing the sense or antisense RNA, zinc finger protein or ribozyme in every cell of a plant.
Constitutive, inducible, tissue-specific or developmentally-regulated expression are also
within the scope of the present invention and result in a constitutive, inducible, tissue-specific or
developmentally-regulated alteration of the expression of a nucleotide sequence of the present
invention in the plant cell. Constructs for expression of the sense or antisense RNA, zinc finger
protein or ribozyme, or for overexpression of a nucleotide sequence of the present invention, are
prepared and transformed into a plant cell according to the teachings of the present invention, e.g. as
described infra.

The invention hence also provides sense and anti-sense nucleic acid molecules corresponding
to the open reading frames identified in the SEQ ID NOs of the Sequence Listing as well as their
orthologs.

The genes and open reading frames according to the present invention which are substantially
similar to a nucleotide sequence encoding a polypeptide as given in any one of the SEQ ID NOs of
the Sequence Listing, including any corresponding anti-sense constructs, can be operably linked to
any promoter that is functional within the plant host including the promoter sequences according to
the invention or mutants thereof.

The present invention further provides a method of augmenting a plant genome by contacting
plant cells with a nucleic acid molecule of the invention, e.g., one having a nucleotide sequence that
directs tissue-specific, tissue-preferential transcription of a linked nucleic acid segment isolatable or
obtained from a plant gene encoding a polypeptide that is substantially similar to a polypeptide
encoded by an Oryza gene having a sequence according to any one of SEQ ID NOs provided in
the Sequence Listing so as to yield transformed plant cells; and regenerating the transformed plant
cells to provide a differentiated transformed plant, wherein the differentiated transformed plant
expresses the nucleic acid molecule in the cells of the plant, preferably in the appropriate tissues of
the plant grain. The nucleic acid molecule may be present in the nucleus, chloroplast, mitochondria
and/or plastid of the cells of the plant.

Plant species may be transformed with the DNA construct of the present invention by the
DNA-mediated transformation of plant cell protoplasts and subsequent regeneration of the plant
from the transformed protoplasts in accordance with procedures well known in the art.

Any plant tissue capable of subsequent clonal propagation, whether by organogenesis or
embryogenesis, may be transformed with a vector of the present invention. The term
"organogenesis," as used herein, means a process by which shoots and roots are developed
sequentially from meristematic centers; the term "embryogenesis," as used herein, means a process
by which shoots and roots develop together in a concerted fashion (not sequentially), whether from
somatic cells or gametes. The particular tissue chosen will vary depending on the clonal propagation
systems available for, and best suited to, the particular species being transformed. Exemplary tissue
targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue,
existing meristematic tissue (e.g., apical meristems, axillary buds, and root meristems), and induced
meristem tissue (e.g., cotyledon meristem and hypocotyl meristem).

Plants of the present invention may take a variety of forms. The plants may be chimeras of
transformed cells and non-transformed cells; the plants may be clonal transformants (e.g., all cells
transformed to contain the expression cassette); the plants may comprise grafts of transformed and
untransformed tissues (e.g., a transformed root stock grafted to an untransformed scion in citrus
species). The transformed plants may be propagated by a variety of means, such as by clonal
propagation or classical breeding techniques. For example, first generation (or T1) transformed
plants may be selfed to give homozygous second generation (or T2) transformed plants, and the T2
plants further propagated through classical breeding techniques. A dominant selectable marker (such
as nptII) can be associated with the expression cassette to assist in breeding.
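
The selfing step mentioned above can be illustrated with a short Python sketch of Mendelian segregation. It assumes, as a simplification not stated in the text, that the T1 plant carries a single transgene insertion at one locus in the hemizygous state; the genotype symbols ("T" for the insertion, "t" for its absence) and the `self_cross` helper are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def self_cross(genotype):
    """Expected offspring genotype fractions when a diploid plant is selfed.

    `genotype` is a two-character string, one character per allele.
    """
    # Every pairing of one maternal and one paternal gamete is equally likely.
    offspring = Counter("".join(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(offspring.values())
    return {g: Fraction(n, total) for g, n in offspring.items()}


# Hemizygous T1 plant: one copy of the insertion ("T"), one empty site ("t").
t2 = self_cross("Tt")

homozygous = t2["TT"]
marker_positive = t2["TT"] + t2["Tt"]   # a dominant marker shows in both classes

print("T2 genotype fractions:", {g: str(f) for g, f in t2.items()})
print("homozygous for the insertion:", homozygous)
print("expressing a dominant marker:", marker_positive)
```

Under this single-locus assumption only a quarter of the T2 plants are homozygous, while three quarters carry the dominant marker, so the marker alone does not distinguish homozygous from hemizygous plants; progeny testing in the following generation is one common way to tell them apart.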

Thus, the present invention provides a transformed (transgenic) plant cell, in planta or ex
planta, including a transformed plastid or other organelle, e.g., nucleus, mitochondria or chloroplast.
The present invention may be used for transformation of any plant species, including, but not limited
to, cells from corn (Zea mays), Brassica sp. (e.g., B. napus, B. rapa, B. juncea), particularly those
Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye
(Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet
(Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger
millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius),
wheat (Triticum aestivum), soybean (Glycine max), tobacco (Nicotiana tabacum), potato
(Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium
hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.),
coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa
(Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig
(Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea),
papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia
integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum
spp.), oats, duckweed (Lemna), barley, vegetables, ornamentals, and conifers.

Duckweed (Lemna, see WO 00/07210) includes members of the family Lemnaceae. There
are four known genera and 34 species of duckweed as follows: genus Lemna (L. aequinoctialis, L.
disperma, L. ecuadoriensis, L. gibba, L. japonica, L. minor, L. miniscula, L. obscura, L.
perpusilla, L. tenera, L. trisulca, L. turionifera, L. valdiviana); genus Spirodela (S. intermedia,
S. polyrrhiza, S. punctata); genus Wolffia (Wa. angusta, Wa. arrhiza, Wa. australina, Wa.
borealis, Wa. brasiliensis, Wa. columbiana, Wa. elongata, Wa. globosa, Wa. microscopica,
Wa. neglecta) and genus Wolffiella (Wl. ultila, Wl. ultilanen, Wl. gladiata, Wl. ultila, Wl.
lingulata, Wl. repunda, Wl. rotunda, and Wl. neotropica). Any other genera or species of
Lemnaceae, if they exist, are also aspects of the present invention. Lemna gibba, Lemna minor,
and Lemna miniscula are preferred, with Lemna minor and Lemna miniscula being most
preferred. Lemna species can be classified using the taxonomic scheme described by Landolt,
Biosystematic Investigation on the Family of Duckweeds: The family of Lemnaceae - A Monograph
Study. Geobotanischen Institut ETH, Stiftung Rubel, Zurich (1986).

Vegetables within the scope of the invention include tomatoes (Lycopersicon esculentum),
lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis),
peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus),
cantaloupe (C. cantalupensis), and musk melon (C. melo). Ornamentals include azalea
(Rhododendron spp.), hydrangea (Macrophylla hydrangea), hibiscus (Hibiscus rosasanensis),
roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida),
carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum.

Conifers that may be employed in practicing the present invention include, for example, pines such as
loblolly pine (Pinus taeda), slash pine (Pinus elliotii), ponderosa pine (Pinus ponderosa),
lodgepole pine (Pinus contorta), and Monterey pine (Pinus radiata); Douglas-fir (Pseudotsuga
menziesii); Western hemlock (Tsuga ultilane); Sitka spruce (Picea glauca); redwood (Sequoia
sempervirens); true firs such as silver fir (Abies amabilis) and balsam fir (Abies balsamea); and
cedars such as Western red cedar (Thuja plicata) and Alaska yellow-cedar (Chamaecyparis
nootkatensis). Leguminous plants include beans and peas. Beans include guar, locust bean,
fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava bean, lentils, chickpea, etc.
Legumes include, but are not limited to, Arachis, e.g., peanuts, Vicia, e.g., crown vetch, hairy vetch,
adzuki bean, mung bean, and chickpea, Lupinus, e.g., lupine, trifolium, Phaseolus, e.g., common
bean and lima bean, Pisum, e.g., field bean, Melilotus, e.g., clover, Medicago, e.g., alfalfa, Lotus,
e.g., trefoil, Lens, e.g., lentil, and false indigo. Preferred forage and turf grass for use in the methods
of the invention include alfalfa, orchard grass, tall fescue, perennial ryegrass, creeping bent grass, and
redtop.

Other plants within the scope of the invention include Acacia, aneth, artichoke, arugula,
blackberry, canola, cilantro, clementines, escarole, eucalyptus, fennel, grapefruit, honey dew, jicama,
kiwifruit, lemon, lime, mushroom, nut, okra, orange, parsley, persimmon, plantain, pomegranate,
poplar, radiata pine, radicchio, Southern pine, sweetgum, tangerine, triticale, vine, yams, apple, pear,
quince, cherry, apricot, melon, hemp, buckwheat, grape, raspberry, chenopodium, blueberry,
nectarine, peach, plum, strawberry, watermelon, eggplant, pepper, cauliflower, Brassica, e.g.,
broccoli, cabbage, brussels sprouts, onion, carrot, leek, beet, broad bean, celery, radish, pumpkin,
endive, gourd, garlic, snapbean, spinach, squash, turnip, ultilane, and zucchini.

Ornamental plants within the scope of the invention include impatiens, Begonia, Pelargonium,
Viola, Cyclamen, Verbena, Vinca, Tagetes, Primula, Saintpaulia, Ageratum, Amaranthus,
Antirrhinum, Aquilegia, Cineraria, Clover, Cosmos, Cowpea, Dahlia, Datura, Delphinium, Gerbera,
Gladiolus, Gloxinia, Hippeastrum, Mesembryanthemum, Salpiglossis, and Zinnia. Other plants
within the scope of the invention are shown in Table 1 (above).

Preferably, transgenic plants of the present invention are crop plants and in particular cereals
(for example, corn, alfalfa, sunflower, rice, Brassica, canola, soybean, barley, sugarbeet,
cotton, safflower, peanut, sorghum, wheat, millet, tobacco, etc.), and even more preferably corn,
rice and soybean.

The present invention also provides a transgenic plant prepared by this method, a seed from
such a plant and progeny plants from such a plant including hybrids and inbreds. Preferred
transgenic plants are transgenic maize, soybean, barley, alfalfa, sunflower, canola, cotton,
peanut, sorghum, tobacco, sugarbeet, rice, wheat, rye, turfgrass, millet, sugarcane, tomato, or
potato.

A transformed (transgenic) plant of the invention includes plants, the genome of which is
augmented by a nucleic acid molecule of the invention, or in which the corresponding gene has been
disrupted, e.g., to result in a loss, a decrease or an alteration, in the function of the product encoded
by the gene, which plant may also have increased yields and/or produce a better-quality product than
the corresponding wild-type plant. The nucleic acid molecules of the invention are thus useful for
targeted gene disruption, as well as markers and probes.

The invention also provides a method of plant breeding, e.g., to prepare a crossed fertile
transgenic plant. The method comprises crossing a fertile transgenic plant comprising a particular
nucleic acid molecule of the invention with itself or with a second plant, e.g., one lacking the
particular nucleic acid molecule, to prepare the seed of a crossed fertile transgenic plant comprising
the particular nucleic acid molecule. The seed is then planted to obtain a crossed fertile transgenic
plant. The plant may be a monocot or a dicot. In a particular embodiment, the plant is a cereal
plant.

The crossed fertile transgenic plant may have the particular nucleic acid molecule inherited
through a female parent or through a male parent. The second plant may be an inbred plant. The
crossed fertile transgenic plant may be a hybrid. Also included within the present invention are seeds of
any of these crossed fertile transgenic plants.
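
As an illustration of the two crosses described above (selfing, or crossing with a second plant lacking the nucleic acid molecule), the following Python sketch enumerates the expected progeny classes. It assumes, purely for illustration, a single-locus insertion inherited in Mendelian fashion and a hemizygous transgenic parent; none of these names or assumptions come from the specification.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Return offspring genotype frequencies for one locus.

    Each parent is a pair of alleles: 1 = carries the transgene, 0 = lacks it.
    Keys of the result are transgene copy numbers (0, 1 or 2).
    """
    counts = Counter(a + b for a, b in product(parent_a, parent_b))
    total = sum(counts.values())
    return {copies: n / total for copies, n in sorted(counts.items())}


def fraction_transgenic(parent_a, parent_b):
    """Fraction of progeny carrying at least one transgene copy."""
    freqs = cross(parent_a, parent_b)
    return sum(f for copies, f in freqs.items() if copies > 0)


# Hypothetical parents (assumption: one insertion, hemizygous transformant).
HEMIZYGOUS = (1, 0)
NON_TRANSGENIC = (0, 0)

selfed = fraction_transgenic(HEMIZYGOUS, HEMIZYGOUS)          # selfing
outcrossed = fraction_transgenic(HEMIZYGOUS, NON_TRANSGENIC)  # second plant lacks the transgene

print(f"selfed: {selfed:.2f} transgenic, outcrossed: {outcrossed:.2f} transgenic")
```

Under these assumptions the seed lot from either cross is a mixture, which is why progeny are normally screened for the particular nucleic acid molecule before being called crossed fertile transgenic plants.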

Transformation of plants can be undertaken with a single DNA molecule or multiple DNA
molecules (i.e., co-transformation), and both these techniques are suitable for use with the expression
cassettes of the present invention. Numerous transformation vectors are available for plant
transformation, and the expression cassettes of this invention can be used in conjunction with any
such vectors. The selection of vector will depend upon the preferred transformation technique and
the target species for transformation.

A variety of techniques are available and known to those skilled in the art for introduction of
constructs into a plant cell host. These techniques generally include transformation with DNA
employing A. tumefaciens or A. rhizogenes as the transforming agent, liposomes, PEG
precipitation, electroporation, DNA injection, direct DNA uptake, microprojectile bombardment,
particle acceleration, and the like (see, for example, EP 295959 and EP 138341) (see below).
However, cells other than plant cells may be transformed with the expression cassettes of the
invention. The general descriptions of plant expression vectors and reporter genes, and
Agrobacterium and Agrobacterium-mediated gene transfer, can be found in Gruber et al. (1993).

Expression vectors containing genomic or synthetic fragments can be introduced into
protoplasts or into intact tissues or isolated cells. Preferably expression vectors are introduced into
intact tissue. General methods of culturing plant tissues are provided for example by Maki et al.
(1993) and by Phillips et al. (1988). Preferably, expression vectors are introduced into maize or
other plant tissues using a direct gene transfer method such as microprojectile-mediated delivery,
DNA injection, electroporation and the like. More preferably expression vectors are introduced into
plant tissues using microprojectile-mediated delivery with the biolistic device. See, for example,
Tomes et al. (1995). The vectors of the invention can not only be used for expression of structural
genes but may also be used in exon-trap cloning, or promoter trap procedures to detect differential
gene expression in varieties of tissues (Lindsey et al., 1993; Auch & Reth et al.).

It is particularly preferred to use the binary type vectors of Ti and Ri plasmids of
Agrobacterium spp. Ti-derived vectors transform a wide variety of higher plants, including
monocotyledonous and dicotyledonous plants, such as soybean, cotton, rape, tobacco, and rice
(Pacciotti et al., 1985; Byrne et al., 1987; Sukhapinda et al., 1987; Lorz et al., 1985; Potrykus,
1985; Park et al., 1985; Hiei et al., 1994). The use of T-DNA to transform plant cells has received
extensive study and is amply described (EP 120516; Hoekema, 1985; Knauf, et al., 1983; and An
et al., 1985). For introduction into plants, the chimeric genes of the invention can be inserted into
binary vectors as described in the examples.

Other transformation methods are available to those skilled in the art, such as direct uptake
of foreign DNA constructs (see EP 295959), techniques of electroporation (Fromm et al., 1986) or
high velocity ballistic bombardment with metal particles coated with the nucleic acid constructs (Klein
et al., 1987, and U.S. Patent No. 4,945,050). Once transformed, the cells can be regenerated by
those skilled in the art. Of particular relevance are the recently described methods to transform
foreign genes into commercially important crops, such as rapeseed (De Block et al., 1989),
sunflower (Everett et al., 1987), soybean (McCabe et al., 1988; Hinchee et al., 1988; Chee et al.,
1989; Christou et al., 1989; EP 301749), rice (Hiei et al., 1994), and corn (Gordon-Kamm et al.,
1990; Fromm et al., 1990).

Those skilled in the art will appreciate that the choice of method might depend on the type of
plant, i.e., monocotyledonous or dicotyledonous, targeted for transformation. Suitable methods of
transforming plant cells include, but are not limited to, microinjection (Crossway et al., 1986),
electroporation (Riggs et al., 1986), Agrobacterium-mediated transformation (Hinchee et al.,
1988), direct gene transfer (Paszkowski et al., 1984), and ballistic particle acceleration using devices
available from Agracetus, Inc., Madison, Wis., and BioRad, Hercules, Calif. (see, for example,
Sanford et al., U.S. Pat. No. 4,945,050; and McCabe et al., 1988). Also see Weissinger et al.,
1988; Sanford et al., 1987 (onion); Christou et al., 1988 (soybean); McCabe et al., 1988
(soybean); Datta et al., 1990 (rice); Klein et al., 1988 (maize); Klein et al., 1988 (maize); Klein et
al., 1988 (maize); Fromm et al., 1990 (maize); Gordon-Kamm et al., 1990 (maize); Svab et al.,
1990 (tobacco chloroplast); Koziel et al., 1993 (maize); Shimamoto et al., 1989 (rice); Christou et
al., 1991 (rice); European Patent Application EP 0 332 581 (orchardgrass and other Pooideae);
Vasil et al., 1993 (wheat); Weeks et al., 1993 (wheat). In one embodiment, the protoplast
transformation method for maize is employed (European Patent Application EP 0 292 435, U.S.
Pat. No. 5,350,689).

In another embodiment, a nucleotide sequence of the present invention is directly transformed
into the plastid genome. Plastid transformation technology is extensively described in U.S. Patent
Nos. 5,451,513, 5,545,817, and 5,545,818, in PCT application no. WO 95/16783, and in
McBride et al., 1994. The basic technique for chloroplast transformation involves introducing regions
of cloned plastid DNA flanking a selectable marker together with the gene of interest into a suitable
target tissue, e.g., using biolistics or protoplast transformation (e.g., calcium chloride or PEG-mediated
transformation). The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate
homologous recombination with the plastid genome and thus allow the replacement or modification of
specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12
genes conferring resistance to spectinomycin and/or streptomycin were utilized as selectable markers
for transformation (Svab et al., 1990; Staub et al., 1992). This resulted in stable homoplasmic
transformants at a frequency of approximately one per 100 bombardments of target leaves. The
presence of cloning sites between these markers allowed creation of a plastid targeting vector for
introduction of foreign genes (Staub et al., 1993). Substantial increases in transformation frequency
are obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a
dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme
aminoglycoside-3'-adenyltransferase (Svab et al., 1993). Other selectable markers useful for
plastid transformation are known in the art and encompassed within the scope of the invention.
Typically, approximately 15-20 cell division cycles following transformation are required to reach a
homoplastidic state. Plastid expression, in which genes are inserted by homologous recombination
into all of the several thousand copies of the circular plastid genome present in each plant cell, takes
advantage of the enormous copy number advantage over nuclear-expressed genes to permit
expression levels that can readily exceed 10% of the total soluble plant protein. In a preferred
embodiment, a nucleotide sequence of the present invention is inserted into a plastid targeting vector
and transformed into the plastid genome of a desired plant host. Plants homoplastic for plastid
genomes containing a nucleotide sequence of the present invention are obtained, and are
preferentially capable of high expression of the nucleotide sequence.

Agrobacterium tumefaciens cells containing a vector comprising an expression cassette of
the present invention, wherein the vector comprises a Ti plasmid, are useful in methods of making
transformed plants. Plant cells are infected with an Agrobacterium tumefaciens as described
above to produce a transformed plant cell, and then a plant is regenerated from the transformed plant
cell. Numerous Agrobacterium vector systems useful in carrying out the present invention are
known.

For example, vectors are available for transformation using Agrobacterium tumefaciens.
These typically carry at least one T-DNA border sequence and include vectors such as pBIN19
(Bevan, 1984). In one preferred embodiment, the expression cassettes of the present invention may
be inserted into either of the binary vectors pCIB200 and pCIB2001 for use with Agrobacterium.
These vector cassettes for Agrobacterium-mediated transformation were constructed in the
following manner. pTJS75kan was created by NarI digestion of pTJS75 (Schmidhauser & Helinski,
1985) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI
fragment from pUC4K carrying an NPTII (Messing & Vierra, 1982; Bevan et al., 1983; McBride et
al., 1990). XhoI linkers were ligated to the EcoRV fragment of pCIB7 which contains the left and
right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein
et al., 1987), and the XhoI-digested fragment was cloned into SalI-digested pTJS75kan to create
pCIB200 (see also EP 0 332 104, example 19). pCIB200 contains the following unique polylinker
restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. The plasmid pCIB2001 is a derivative of
pCIB200 which was created by the insertion into the polylinker of additional restriction sites.
Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI,
MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these unique
restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for
Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between
E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker
is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
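
When choosing a polylinker site for cloning an expression cassette, one practical check is whether the enzyme also cuts inside the cassette. The Python sketch below illustrates that check for the unique sites listed above. The recognition sequences are the standard ones for these enzymes and are not given in this specification, and the insert sequence is a made-up placeholder; treat both as assumptions.

```python
# Unique polylinker sites as listed in the text.
PCIB200_SITES = ["EcoRI", "SstI", "KpnI", "BglII", "XbaI", "SalI"]
PCIB2001_SITES = PCIB200_SITES + ["MluI", "BclI", "AvrII", "ApaI", "HpaI", "StuI"]

# Standard recognition sequences (assumption: not stated in the source).
# All are palindromic, so scanning one strand is sufficient.
RECOGNITION = {
    "EcoRI": "GAATTC", "SstI": "GAGCTC", "KpnI": "GGTACC",
    "BglII": "AGATCT", "XbaI": "TCTAGA", "SalI": "GTCGAC",
    "MluI": "ACGCGT", "BclI": "TGATCA", "AvrII": "CCTAGG",
    "ApaI": "GGGCCC", "HpaI": "GTTAAC", "StuI": "AGGCCT",
}


def usable_sites(insert_seq, polylinker_sites):
    """Return polylinker enzymes that do NOT cut within the insert."""
    seq = insert_seq.upper()
    return [enz for enz in polylinker_sites if RECOGNITION[enz] not in seq]


# Hypothetical insert containing internal EcoRI and KpnI sites.
insert = "ATGGAATTCTTTGGTACCAAA"

print("pCIB200 :", usable_sites(insert, PCIB200_SITES))
print("pCIB2001:", usable_sites(insert, PCIB2001_SITES))
```

The larger set of sites in pCIB2001 simply leaves more candidates once internal cutters have been excluded.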

An additional vector useful for Agrobacterium-mediated transformation is the binary vector
pCIB10, which contains a gene encoding kanamycin resistance for selection in plants, T-DNA right
and left border sequences and incorporates sequences from the wide host-range plasmid pRK252
allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by
Rothstein et al., 1987. Various derivatives of pCIB10 have been constructed which incorporate the
gene for hygromycin B phosphotransferase described by Gritz et al., 1983. These derivatives enable
selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin
(pCIB715, pCIB717).

Methods using either a form of direct gene transfer or Agrobacterium-mediated transfer
usually, but not necessarily, are undertaken with a selectable marker which may provide resistance to
an antibiotic (e.g., kanamycin, hygromycin or methotrexate) or a herbicide (e.g., phosphinothricin).
The choice of selectable marker for plant transformation is not, however, critical to the invention.

For certain plant species, different antibiotic or herbicide selection markers may be
preferred. Selection markers used routinely in transformation include the nptII gene which confers
resistance to kanamycin and related antibiotics (Messing & Vierra, 1982; Bevan et al., 1983), the
bar gene which confers resistance to the herbicide phosphinothricin (White et al., 1990, Spencer et
al., 1990), the hph gene which confers resistance to the antibiotic hygromycin (Blochinger &
Diggelmann), and the dhfr gene, which confers resistance to methotrexate (Bourouis et al., 1983).
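
The marker-to-agent pairings named in this paragraph can be kept as a small lookup table when planning selection media. The Python sketch below records only the pairings stated in the text; the function name and structure are illustrative assumptions.

```python
# Marker gene -> selective agent, as stated in the paragraph above.
SELECTABLE_MARKERS = {
    "nptII": "kanamycin",
    "bar": "phosphinothricin",
    "hph": "hygromycin",
    "dhfr": "methotrexate",
}


def markers_for(agent):
    """Return the marker genes listed as conferring resistance to an agent."""
    wanted = agent.lower()
    return sorted(g for g, a in SELECTABLE_MARKERS.items() if a == wanted)


print(markers_for("Hygromycin"))
print(markers_for("phosphinothricin"))
```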

Selection markers resulting in positive selection, such as a phosphomannose isomerase gene,
as described in patent application WO 93/05163, are also used. Other genes to be used for positive
selection are described in WO 94/20627 and encode xyloisomerases and phosphomannoisomerases
such as mannose-6-phosphate isomerase and mannose-1-phosphate isomerase;
phosphomanno mutase; mannose epimerases such as those which convert carbohydrates to mannose
or mannose to carbohydrates such as glucose or galactose; phosphatases such as mannose or xylose
phosphatase, mannose-6-phosphatase and mannose-1-phosphatase; and permeases which are
involved in the transport of mannose, or a derivative, or a precursor thereof into the cell. The agent
which reduces the toxicity of the compound to the cells is typically a glucose derivative such as
methyl-3-O-glucose or phloridzin. Transformed cells are identified without damaging or killing the
non-transformed cells in the population and without co-introduction of antibiotic or herbicide
resistance genes. As described in WO 93/05163, in addition to the fact that the need for antibiotic or
herbicide resistance genes is eliminated, it has been shown that the positive selection method is often
far more efficient than traditional negative selection.

One vector useful for direct gene transfer techniques in combination with selection by the
herbicide Basta (or phosphinothricin) is pCIB3064. This vector is based on the plasmid pCIB246,
which comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene and the
CaMV 35S transcriptional terminator and is described in the PCT published application WO
93/07278, herein incorporated by reference. One gene useful for conferring resistance to
phosphinothricin is the bar gene from Streptomyces viridochromogenes (Thompson et al., 1987).
This vector is suitable for the cloning of plant expression cassettes containing their own regulatory
signals.

An additional transformation vector is pSOG35 which utilizes the E. coli gene dihydrofolate
reductase (DHFR) as a selectable marker conferring resistance to methotrexate. PCR was used to
amplify the 35S promoter (about 800 bp), intron 6 from the maize Adh1 gene (about 550 bp) and
18 bp of the GUS untranslated leader sequence from pSOG10. A 250 bp fragment encoding the E.
coli dihydrofolate reductase type II gene was also amplified by PCR and these two PCR fragments
were assembled with a SacI-PstI fragment from pBI221 (Clontech) which comprised the pUC19
vector backbone and the nopaline synthase terminator. Assembly of these fragments generated
pSOG19 which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the
DHFR gene and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with
the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generated the vector
pSOG35. pSOG19 and pSOG35 carry the pUC-derived gene for ampicillin resistance and have
HindIII, SphI, PstI and EcoRI sites available for the cloning of foreign sequences.

Binary backbone vector pNOV2117 contains the T-DNA portion flanked by the right and
left border sequences, and including the Positech™ (Syngenta) plant selectable marker and the
"grain filling candidate gene" gene expression cassette. The Positech™ plant selectable marker
confers resistance to mannose and in this instance consists of the maize ubiquitin promoter driving
expression of the PMI (phosphomannose isomerase) gene, followed by the cauliflower mosaic virus
transcriptional terminator.



Transgenic plant cells are then placed in an appropriate selective medium for selection of
transgenic cells which are then grown to callus. Shoots are grown from callus and plantlets
generated from the shoot by growing in rooting medium. The various constructs normally will be
joined to a marker for selection in plant cells. Conveniently, the marker may be resistance to a
biocide (particularly an antibiotic, such as kanamycin, G418, bleomycin, hygromycin,
chloramphenicol, herbicide, or the like). The particular marker used will allow for selection of
transformed cells as compared to cells lacking the DNA which has been introduced. Components of
DNA constructs including transcription cassettes of this invention may be prepared from sequences
which are native (endogenous) or foreign (exogenous) to the host. By "foreign" it is meant that the
sequence is not found in the wild-type host into which the construct is introduced. Heterologous
constructs will contain at least one region which is not native to the gene from which the
transcription-initiation-region is derived.

To confirm the presence of the transgenes in transgenic cells and plants, a variety of assays
may be performed. Such assays include, for example, "molecular biological" assays well known to
those of skill in the art, such as Southern and Northern blotting, in situ hybridization and nucleic
acid-based amplification methods such as PCR or RT-PCR; "biochemical" assays, such as detecting
the presence of a protein product, e.g., by immunological means (ELISAs and Western blots) or by
enzymatic function; plant part assays, such as seed assays; and also, by analyzing the phenotype of
the whole regenerated plant, e.g., for disease or pest resistance.

DNA may be isolated from cell lines or any plant parts to determine the presence of the
preselected nucleic acid segment through the use of techniques well known to those skilled in the art.
Note that intact sequences will not always be present, presumably due to rearrangement or deletion
of sequences in the cell.

The presence of nucleic acid elements introduced through the methods of this invention may be
determined by polymerase chain reaction (PCR). Using this technique discrete fragments of nucleic
acid are amplified and detected by gel electrophoresis. This type of analysis permits one to
determine whether a preselected nucleic acid segment is present in a stable transformant, but does
not prove integration of the introduced preselected nucleic acid segment into the host cell genome.
In addition, it is not possible using PCR techniques to determine whether transformants have
exogenous genes introduced into different sites in the genome, i.e., whether transformants are of
independent origin. It is contemplated that using PCR techniques it would be possible to clone
fragments of the host genomic DNA adjacent to an introduced preselected DNA segment.

Positive proof of DNA integration into the host genome and the independent identities of
transformants may be determined using the technique of Southern hybridization. Using this technique
specific DNA sequences that were introduced into the host genome and flanking host DNA
sequences can be identified. Hence the Southern hybridization pattern of a given transformant serves
as an identifying characteristic of that transformant. In addition it is possible through Southern
hybridization to demonstrate the presence of introduced preselected DNA segments in high
molecular weight DNA, i.e., confirm that the introduced preselected DNA segment has been
integrated into the host cell genome. The technique of Southern hybridization provides the information
that is obtained using PCR, e.g., the presence of a preselected DNA segment, but also demonstrates
integration into the genome and characterizes each individual transformant.

It is contemplated that using the techniques of dot or slot blot hybridization, which are
modifications of Southern hybridization techniques, one could obtain the same information that is
derived from PCR, e.g., the presence of a preselected DNA segment.

Both PCR and Southern hybridization techniques can be used to demonstrate transmission of
a preselected DNA segment to progeny. In most instances the characteristic Southern hybridization
pattern for a given transformant will segregate in progeny as one or more Mendelian genes (Spencer
et al., 1992; Laursen et al., 1994) indicating stable inheritance of the gene. The nonchimeric nature
of the callus and the parental transformants (R0) was suggested by germline transmission and the
identical Southern blot hybridization patterns and intensities of the transforming DNA in callus, R0
plants and R1 progeny that segregated for the transformed gene.
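
Whether progeny counts are consistent with segregation as a single Mendelian gene is commonly checked with a chi-square goodness-of-fit test. The Python sketch below shows the arithmetic for a 3:1 expectation in selfed progeny; the test itself, the progeny counts, and the 5% critical value (3.841 for one degree of freedom) are illustrative assumptions and are not taken from the specification.

```python
def chi_square(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed class counts."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value at the 5% level for one degree of freedom (two classes).
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical R1 progeny scored by PCR or Southern: [positive, negative].
observed = [78, 22]

stat = chi_square(observed, [3, 1])
fits_3_to_1 = stat < CRITICAL_5PCT_DF1

print(f"chi-square = {stat:.2f}; consistent with 3:1 = {fits_3_to_1}")
```

A statistic below the critical value means the counts do not contradict single-locus inheritance; it does not by itself prove a single insertion, which is where the Southern pattern remains informative.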

Whereas DNA analysis techniques may be conducted using DNA isolated from any part of a
plant, RNA may only be expressed in particular cells or tissue types and hence it will be necessary to
prepare RNA for analysis from these tissues. PCR techniques may also be used for detection and
quantitation of RNA produced from introduced preselected DNA segments. In this application of
PCR it is first necessary to reverse transcribe RNA into DNA, using enzymes such as reverse
transcriptase, and then through the use of conventional PCR techniques amplify the DNA. In most



instances PCR techniques, while useful, will not demonstrate integrity of the RNA product. Further
information about the nature of the RNA product may be obtained by Northern blotting. This
technique will demonstrate the presence of an RNA species and give information about the integrity
of that RNA. The presence or absence of an RNA species can also be determined using dot or slot
blot Northern hybridizations. These techniques are modifications of Northern blotting and will only
demonstrate the presence or absence of an RNA species.

While Southern blotting and PCR may be used to detect the preselected DNA segment in
question, they do not provide information as to whether the preselected DNA segment is being
expressed. Expression may be evaluated by specifically identifying the protein products of the
introduced preselected DNA segments or evaluating the phenotypic changes brought about by their
expression.
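
The preceding paragraphs distinguish what each nucleic acid assay can and cannot demonstrate. The Python sketch below restates those distinctions as a lookup table so that an assay can be chosen by the question being asked; the property labels and function name are invented for illustration, while the assay capabilities follow the text.

```python
# What each assay demonstrates, per the description above.
ASSAY_SHOWS = {
    "PCR": {"dna_presence"},
    "Southern": {"dna_presence", "genomic_integration", "independent_origin"},
    "DNA dot/slot blot": {"dna_presence"},
    "RT-PCR": {"rna_presence"},
    "Northern": {"rna_presence", "rna_integrity"},
    "RNA dot/slot blot": {"rna_presence"},
}


def assays_showing(prop):
    """Return the assays that demonstrate the given property."""
    return sorted(name for name, props in ASSAY_SHOWS.items() if prop in props)


print(assays_showing("genomic_integration"))
print(assays_showing("rna_integrity"))
print(assays_showing("dna_presence"))
```

None of these assays reports on the protein product, which is why expression is evaluated separately through protein or phenotype assays.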

Assays for the production and identification of specific proteins may make use of physical-chemical,
structural, functional, or other properties of the proteins. Unique physical-chemical or
structural properties allow the proteins to be separated and identified by electrophoretic procedures,
such as native or denaturing gel electrophoresis or isoelectric focusing, or by chromatographic
techniques such as ion exchange or gel exclusion chromatography. The unique structures of
individual proteins offer opportunities for use of specific antibodies to detect their presence in formats
such as an ELISA assay. Combinations of approaches may be employed with even greater
specificity such as Western blotting in which antibodies are used to locate individual gene products
that have been separated by electrophoretic techniques. Additional techniques may be employed to
absolutely confirm the identity of the product of interest such as evaluation by amino acid sequencing
following purification. Although these are among the most commonly employed, other procedures
may be additionally used.

Assay procedures may also be used to identify the expression of proteins by their functionality,
especially the ability of enzymes to catalyze specific chemical reactions involving specific substrates
and products. These reactions may be followed by providing and quantifying the loss of substrates
or the generation of products of the reactions by physical or chemical procedures. Examples are as
varied as the enzyme to be analyzed.

Very frequently the expression of a gene product is determined by evaluating the
phenotypic results of its expression. These assays also may take many forms including but not
limited to analyzing changes in the chemical composition, morphology, or physiological properties of
the plant. Morphological changes may include greater stature or thicker stalks. Most often changes
in response of plants or plant parts to imposed treatments are evaluated under carefully controlled
conditions termed bioassays.

The compositions of the invention include plant nucleic acid molecules, and the amino acid
sequences for the polypeptides or partial-length polypeptides encoded by the nucleic acid molecule
which comprises an open reading frame. These sequences can be employed to alter expression of a
particular gene corresponding to the open reading frame by decreasing or eliminating expression of
that plant gene or by overexpressing a particular gene product. Methods of this embodiment of the
invention include stably transforming a plant with the nucleic acid molecule of the invention which
includes an open reading frame operably linked to a promoter capable of driving expression of that
open reading frame (sense or antisense) in a plant cell. By "portion" or "fragment", as it relates to a
nucleic acid molecule which comprises an open reading frame or a fragment thereof encoding a
partial-length polypeptide having the activity of the full length polypeptide, is meant a sequence
having at least 80 nucleotides, more preferably at least 150 nucleotides, and still more preferably at
least 400 nucleotides. If not employed for expressing, a "portion" or "fragment" means at least 9,
preferably 12, more preferably 15, even more preferably at least 20, consecutive nucleotides, e.g.,
probes and primers (oligonucleotides), corresponding to the nucleotide sequence of the nucleic acid
molecules of the invention. Thus, to express a particular gene product, the method comprises
introducing to a plant, plant cell, or plant tissue an expression cassette comprising a promoter linked
to an open reading frame so as to yield a transformed differentiated plant, transformed cell or
transformed tissue. Transformed cells or tissue can be regenerated to provide a transformed
differentiated plant. The transformed differentiated plant or cells thereof preferably expresses the
open reading frame in an amount that alters the amount of the gene product in the plant or cells
thereof, which product is encoded by the open reading frame. The present invention also provides a
transformed plant prepared by the method, progeny and seed thereof.
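
The length thresholds given above for a "portion" or "fragment" differ depending on whether the sequence is used for expression or as a probe or primer. The Python sketch below encodes those thresholds; the tier labels and the function name are illustrative assumptions, while the nucleotide counts are the ones stated in the paragraph.

```python
# (minimum length in nucleotides, label), checked from longest to shortest.
EXPRESSION_TIERS = [
    (400, "still more preferred"),
    (150, "more preferred"),
    (80, "minimum"),
]
PROBE_TIERS = [
    (20, "even more preferred"),
    (15, "more preferred"),
    (12, "preferred"),
    (9, "minimum"),
]


def fragment_tier(length_nt, for_expression):
    """Return the highest tier a fragment length reaches, or None if too short."""
    tiers = EXPRESSION_TIERS if for_expression else PROBE_TIERS
    for threshold, label in tiers:
        if length_nt >= threshold:
            return label
    return None


print(fragment_tier(100, for_expression=True))
print(fragment_tier(18, for_expression=False))
print(fragment_tier(8, for_expression=False))
```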

The invention further includes a nucleotide sequence which is complementary to one
(hereinafter "test" sequence) which hybridizes under stringent conditions with a nucleic acid molecule
of the invention as well as RNA which is transcribed from the nucleic acid molecule. When the
hybridization is performed under stringent conditions, either the test or nucleic acid molecule of
invention is preferably supported, e.g., on a membrane or DNA chip. Thus, either a denatured test
or nucleic acid molecule of the invention is preferably first bound to a support and hybridization is
effected for a specified period of time at a temperature of, e.g., between 55 and 70°C, in double
strength citrate buffered saline (SSC) containing 0.1% SDS followed by rinsing of the support at the
same temperature but with a buffer having a reduced SSC concentration. Depending upon the degree
of stringency required such reduced concentration buffers are typically single strength SSC containing
0.1% SDS, half strength SSC containing 0.1% SDS and one-tenth strength SSC containing 0.1% SDS.
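
The hybridization and rinse buffers described above form a dilution series of SSC. The Python sketch below works out the salt concentrations at each strength, assuming the conventional definition of single strength (1x) SSC as 150 mM sodium chloride and 15 mM sodium citrate; that definition is standard laboratory usage and is not stated in this specification.

```python
# Conventional 1x SSC composition (assumption: not given in the source).
NACL_MM_1X = 150.0
CITRATE_MM_1X = 15.0


def ssc_composition(strength):
    """Return NaCl and sodium citrate concentrations (mM) at a given SSC strength."""
    return {
        "NaCl_mM": strength * NACL_MM_1X,
        "citrate_mM": strength * CITRATE_MM_1X,
    }


# Buffers named in the text: hybridization at double strength, then
# rinses of decreasing salt, i.e. increasing stringency.
WASH_SERIES = [
    ("double strength (hybridization)", 2.0),
    ("single strength", 1.0),
    ("half strength", 0.5),
    ("one-tenth strength", 0.1),
]

for label, strength in WASH_SERIES:
    comp = ssc_composition(strength)
    print(f"{label}: {comp['NaCl_mM']:.1f} mM NaCl, "
          f"{comp['citrate_mM']:.2f} mM citrate, 0.1% SDS")
```

Lower salt destabilizes mismatched duplexes, so the one-tenth strength rinse is the most stringent of the three.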

In a further embodiment, the present invention provides a transformed plant host cell, or one
obtained through breeding, capable of over-expressing, under-expressing, or having a knock out of
amino acid genes and/or their gene products. The plant cell is transformed with at least one such
expression vector wherein the plant host cell can be used to regenerate plant tissue or an entire plant,
or seed therefrom, in which the effects of expression, including overexpression or underexpression,
of the introduced sequence or sequences can be measured in vitro or in planta.

Polynucleotides derived from the nucleic acid molecules of the present invention having any 
of the nucleotide sequences of SEQ ID NO: 1 to 461 and 501 to 511, respectively, encoding a
polypeptide the expression of which is up-regulated during grain filling, are useful to detect the
presence in a test sample of at least one copy of a nucleotide sequence containing the same or
substantially the same sequence, or a fragment, complement, or variant thereof. The sequence of the
probes and/or primers of the instant invention need not be identical to those provided in the
Sequence Listing or the complements thereof. Some variation in probe or primer sequence and/or
length can allow additional family members to be detected, as well as orthologous genes and more
taxonomically distant related sequences. Similarly, probes and/or primers of the invention can include
additional nucleotides that serve as a label for detecting duplexes, for isolation of duplexed 
polynucleotides, or for cloning purposes.

Preferred probes and primers of the invention include isolated, purified, or recombinant
polynucleotides containing a contiguous span of between at least 12 to at least 1000 nucleotides of
any nucleotide sequence which is substantially similar, and preferably has at least between 70% and
99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively,
encoding a polypeptide the expression of which is up-regulated during grain filling, or the
complements thereof, with each individual number of nucleotides within this range also being part of
the invention. Preferred are isolated, purified, or recombinant polynucleotides containing a contiguous
span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500,
750, or 1000 nucleotides of any nucleotide sequence which is substantially similar, and preferably
has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511,
and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated
during grain filling, or the complements thereof. The appropriate length for primers and probes will
vary depending on the application. For use as PCR primers, probes are 12-40 nucleotides,
preferably 18-30 nucleotides long. For use in mapping, probes are 50 to 500 nucleotides,
preferably 100-250 nucleotides long. For use in Southern hybridizations, probes as long as several
kilobases can be used. The appropriate length for primers and probes under a particular set of assay 
conditions may be empirically determined by one of skill in the art. 
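
The length guidance above can be captured as a small lookup, shown here as a minimal Python sketch. The dictionary and function names are illustrative assumptions and are not part of the disclosure; the numeric ranges are those stated in the text. No range is given for Southern hybridization because the text only says that probes may be as long as several kilobases.

```python
# Length ranges (in nucleotides) taken from the text above.
# LENGTH_GUIDE and classify_length are hypothetical names used for illustration.
LENGTH_GUIDE = {
    "pcr": {"allowed": (12, 40), "preferred": (18, 30)},
    "mapping": {"allowed": (50, 500), "preferred": (100, 250)},
}


def classify_length(application, length):
    """Return 'preferred', 'allowed', or 'outside' for a probe or primer length."""
    guide = LENGTH_GUIDE[application]
    low, high = guide["preferred"]
    if low <= length <= high:
        return "preferred"
    low, high = guide["allowed"]
    if low <= length <= high:
        return "allowed"
    return "outside"
```

As the text notes, the final choice under a particular set of assay conditions would still be determined empirically.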

The primers and probes can be prepared by any suitable method, including, for example, 
cloning and restriction of appropriate sequences and direct chemical synthesis by a method such as 

the phosphodiester method of Narang et al (Meth Enzymol 68: 90 (1979)), the
diethylphosphoramidite method, the triester method of Matteucci et al (J Am Chem Soc 103: 3185
(1981)), or according to Urdea et al (Proc Natl Acad 80: 7461 (1981)), the solid support method
described in EP 0 707 592, or using commercially available automated oligonucleotide synthesizers.
Detection probes are generally nucleotide sequences or uncharged nucleotide analogs such as,
for example peptide nucleotides which are disclosed in International Patent Application WO
92/20702, morpholino analogs which are described in U.S. Patent Nos. 5,185,444, 5,034,506 and
5,142,047. The probe may have to be rendered "non-extendable" such that additional dNTPs
cannot be added to the probe. Analogs are usually non-extendable, and nucleotide probes can be
rendered non-extendable by modifying the 3' end of the probe such that the hydroxyl group is no
longer capable of participating in elongation. For example, the 3' end of the probe can be
functionalized with the capture or detection label to thereby consume or otherwise block the 
hydroxyl group. Alternatively, the 3' hydroxyl group simply can be cleaved, replaced or modified so 
as to render the probe non-extendable. 

Any of the polynucleotides of the present invention can be labeled, if desired, by incorporating
a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical
means. For example, useful labels include radioactive substances (32P, 35S, 3H, 125I), fluorescent
dyes (5-bromodesoxyuridine, fluorescein, acetylaminofluorene, digoxigenin) or biotin. Preferably,
polynucleotides are labeled at their 3' and 5' ends. Examples of non-radioactive labeling of
nucleotide fragments are described in the French patent No. FR-7810975 and by Urdea et al (Nuc
Acids Res 16:4937 (1988)). In addition, the probes according to the present invention may have
structural characteristics such that they allow the signal amplification, such structural characteristics 
being, for example, branched DNA probes as described in EP 0 225 807. 

A label can also be used to capture the primer so as to facilitate the immobilization of either the
primer or a primer extension product, such as amplified DNA, on a solid support. A capture label is
attached to the primers or probes and can be a specific binding member that forms a binding pair
with the solid phase reagent's specific binding member, for example biotin and streptavidin.
Therefore depending upon the type of label carried by a polynucleotide or a probe, it may be
employed to capture or to detect the target DNA. Further, it will be understood that the
polynucleotides, primers or probes provided herein, may, themselves, serve as the capture label.
For example, in the case where a solid phase reagent's binding member is a nucleotide sequence, it
may be selected such that it binds a complementary portion of a primer or probe to thereby
immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself
serves as the binding member, those skilled in the art will recognize that the probe will contain a
sequence or "tail" that is not complementary to the target. In the case where a polynucleotide primer
itself serves as the capture label, at least a portion of the primer will be free to hybridize with a 
nucleotide on a solid phase. DNA labeling techniques are well known in the art. 

Any of the polynucleotides, primers and probes of the present invention can be conveniently 
immobilized on a solid support. Solid supports are known to those skilled in the art and include the
walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips,
membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes 
and others. The solid support is not critical and can be selected by one skilled in the art. Thus, latex 
particles, microparticles, magnetic or non-magnetic beads, membranes, plastic tubes, walls of 
microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and
duracytes are all suitable examples. Suitable methods for immobilizing nucleotides on solid phases
include ionic, hydrophobic, covalent interactions and the like. A solid support, as used herein, refers
to any material that is insoluble, or can be made insoluble by a subsequent reaction. The solid
support can be chosen for its intrinsic ability to attract and immobilize the capture reagent.
Alternatively, the solid phase can retain an additional receptor that has the ability to attract and
immobilize the capture reagent. The additional receptor can include a charged substance that is
oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to
the capture reagent. As yet another alternative, the receptor molecule can be any specific binding
member which is immobilized upon (attached to) the solid support and which has the ability to
immobilize the capture reagent through a specific binding reaction. The receptor molecule enables
the indirect binding of the capture reagent to a solid support material before the performance of the
assay or during the performance of the assay. The solid phase thus can be a plastic, derivatized
plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet,
bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes and other
configurations known to those of ordinary skill in the art. The polynucleotides of the invention can be
attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 15,
20, or 25 distinct polynucleotides of the invention to a single solid support. In addition, 
polynucleotides other than those of the invention may be attached to the same solid support as one 
or more polynucleotides of the invention. 

The polynucleotides of the invention that are expressed or repressed in response to
environmental stimuli such as, for example, biotic or abiotic stress or treatment with chemicals or
pathogens or at different developmental stages can be identified by employing an array of nucleic
acid samples, e.g., each sample having a plurality of oligonucleotides, and each plurality
corresponding to a different plant gene, on a solid substrate, e.g., a DNA chip, and probes
corresponding to nucleic acid expressed in, for example, one or more plant tissues and/or at one or
more developmental stages, e.g., probes corresponding to nucleic acid expressed in seed of a plant
relative to control nucleic acid from sources other than seed. Thus, genes that are upregulated or
downregulated in the majority of tissues at a majority of developmental stages, or upregulated or
downregulated in one tissue such as in seed, can be systematically identified. The probes may also
correspond to nucleic acid expressed in response to a defined treatment such as, for example, a
treatment with a variety of plant hormones or the exposure to specific environmental conditions 
involving, for example, an abiotic stress or exposure to light. 

Specifically, labeled rice cRNA probes were hybridized to the rice DNA array, expression
levels were determined by laser scanning and then rice genes were identified that had a particular
expression pattern. The rice oligonucleotide probe array consists of probes from over 18,000
unique rice genes, which covers approximately 40-50% of the genome. This genome array permits a
broader, more complete and less biased analysis of gene expression.

As described herein, GeneChip® technology was utilized to discover rice genes that are
preferentially (or exclusively) expressed during the grain filling process in specific tissues of the plant
grain such as, for example, the aleurone, embryo, endosperm, seed coat, etc.

Using this approach, 461 genes were identified, the expression of which was specifically
elevated during the grain filling process.
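
The kind of selection described above, picking genes whose array signal is elevated in one condition relative to a control, can be sketched as follows. This is a minimal illustration only: the text does not state the selection criteria actually used, so the log2 fold-change threshold, the signal floor, the function name, and the example values are all assumptions.

```python
from math import log2


def upregulated_genes(stage_signal, control_signal, min_log2_fold=1.0, floor=1.0):
    """List genes whose signal exceeds the control by a log2 fold threshold.

    stage_signal and control_signal map gene identifiers to array signal values.
    Signals are clamped to `floor` so that the ratio is always defined.
    The threshold and the floor are assumptions, not values from the text.
    """
    hits = []
    for gene, signal in stage_signal.items():
        control = control_signal.get(gene)
        if control is None:
            continue  # gene not measured in the control sample
        ratio = log2(max(signal, floor) / max(control, floor))
        if ratio >= min_log2_fold:
            hits.append((gene, round(ratio, 2)))
    # Strongest induction first
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical signal values, for illustration only
grain_filling = {"geneA": 800.0, "geneB": 120.0, "geneC": 50.0}
control = {"geneA": 100.0, "geneB": 110.0, "geneC": 400.0}
selected = upregulated_genes(grain_filling, control)
```

With these made-up values only geneA passes the two-fold threshold; a real analysis would also need replicates and a statistical test.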

Consequently, the invention also deals with a method for detecting the presence of a
polynucleotide including a nucleotide sequence which is substantially similar, and preferably has at
least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and
513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain
filling, or a fragment or a variant thereof, or a complementary sequence thereto in a sample, the
method including the following steps of:
(a) bringing into contact a nucleotide probe or a plurality of nucleotide probes which
can hybridize with a polynucleotide having a nucleotide sequence which is substantially similar,
and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID
NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression
of which is up-regulated during grain filling, or a fragment or a variant thereof, or a
complementary sequence thereto and the sample to be assayed; and
(b) detecting the hybrid complex formed between the probe and a nucleotide in the
sample.

The invention further concerns a kit for detecting the presence of a polynucleotide including a 
nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% 
sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively,
encoding a polypeptide the expression of which is up-regulated during grain filling, or a fragment or a
variant thereof, or a complementary sequence thereto in a sample, the kit including a nucleotide
probe or a plurality of nucleotide probes which can hybridize with a nucleotide sequence included in
a polynucleotide including a nucleotide sequence which is substantially similar, and preferably has at
least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and
513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain 
filling, or a fragment or a variant thereof, or a complementary sequence thereto and, optionally, the 
reagents necessary for performing the hybridization reaction. 

In a first preferred embodiment of this detection method and kit, the nucleotide probe or the 
plurality of nucleotide probes are labeled with a detectable molecule. In a second preferred 
embodiment of the method and kit, the nucleotide probe or the plurality of nucleotide probes has 
been immobilized on a substrate. 

The isolated polynucleotides of the invention can be used to create various types of genetic 
and physical maps of the genome of rice or other plants. Such maps are used to devise positional 
cloning strategies for isolating novel genes from the mapped crop species. The sequences of the 
present invention are also useful for chromosome mapping, chromosome identification, tagging of 
genes that are involved in the grain filling process. 

The isolated polynucleotides of the invention can further be used as probes for identifying 
polymorphisms associated with phenotypes of interest such as, for example, enhanced phosphate 
utilization, and higher yield. Briefly, total DNA is isolated from an individual or isogenic line, cleaved 
with one or more restriction enzymes, separated according to mass, transferred to a solid support, 
and hybridized with a probe molecule according to the invention. The pattern of fragments 
hybridizing to a probe molecule is compared for DNA from different individuals or lines, where
differences in fragment size signal a polymorphism associated with a particular nucleotide sequence
according to the present invention. After identification of polymorphic sequences, linkage studies can
be conducted. After identification of many polymorphisms using a nucleotide sequence according to
the invention, linkage studies can be conducted by using the individuals showing polymorphisms as
parents in crossing programs. Recombinants, F2 progeny recombinants or recombinant inbreds, can
then be analyzed using the same restriction enzyme/hybridization procedure. The order of DNA
polymorphisms along the chromosomes can be inferred based on the frequency with which they are
inherited together versus inherited independently. The closer together two polymorphisms occur in a
chromosome, the higher the probability that they are inherited together. Integration of the relative
positions of polymorphisms and associated marker sequences produces a genetic map of the
species, where the distances between markers reflect the recombination frequencies in that
chromosome segment. Preferably, the polymorphisms and marker sequences are sufficiently
numerous to produce a genetic map of sufficiently high resolution to locate one or more loci of
interest.
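
The relationship between co-inheritance and map distance described above can be illustrated with a short Python sketch. It assumes each marker has been scored across the same set of lines as a string of parental alleles, "A" or "B", with any other character treated as missing data. The function names and genotype strings are hypothetical. The second function applies the standard Haldane-Waddington relation R = 2r/(1 + 2r) for recombinant inbred lines produced by selfing, which is an addition for illustration and is not taken from the text.

```python
def recombinant_fraction(marker_a, marker_b):
    """Fraction of lines in which two markers carry different parental alleles."""
    valid = {"A", "B"}
    pairs = [(a, b) for a, b in zip(marker_a, marker_b) if a in valid and b in valid]
    if not pairs:
        raise ValueError("no lines scored for both markers")
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def ril_to_meiotic_r(observed_fraction):
    """Convert the recombinant fraction R observed in selfed recombinant inbred
    lines to the per-meiosis recombination frequency r, using R = 2r / (1 + 2r)."""
    if not 0.0 <= observed_fraction <= 0.5:
        raise ValueError("observed fraction must lie between 0 and 0.5")
    return observed_fraction / (2.0 * (1.0 - observed_fraction))


# Hypothetical scores for two markers across ten recombinant inbred lines
marker_1 = "AABBAABBAB"
marker_2 = "AABBAABBBA"
observed = recombinant_fraction(marker_1, marker_2)   # 2 of 10 lines differ
per_meiosis = ril_to_meiotic_r(observed)
```

Markers that are rarely separated give a small fraction and are placed close together on the map; pairwise fractions across many markers are then combined to order them.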

The use of recombinant inbred lines for such genetic mapping is described for rice (Oh et al.,
Mol Cells 8:175 (1998); Nandi et al., Mol Gen Genet 255:1 (1997); Wang et al., Genetics
136:1421 (1994)), sorghum (Subudhi et al., Genome 43:240 (2000)), maize (Burr et al., Genetics
118:519 (1998); Gardiner et al., Genetics 134:917 (1993)), and Arabidopsis (Methods in
Molecular Biology, Martinez-Zapater and Salinas, eds., 82:137-146 (1998)). However, this
procedure is not limited to plants and can be used for other organisms such as yeast or other fungi, 
or for oomycetes or other protistans. 

The nucleotide sequences of the present invention can also be used for simple sequence
repeat (SSR) mapping, also known as single sequence repeat mapping. SSR mapping in rice
has been described by Miyao et al. (DNA Res 3:233 (1996)) and Yang et al. (Mol Gen Genet
245:187 (1994)), and in maize by Ahn et al. (Mol Gen Genet 241:483 (1993)). SSR mapping can
be achieved using various methods. In one instance, polymorphisms are identified when sequence
specific probes flanking an SSR contained within a sequence of the invention are made and used in
polymerase chain reaction (PCR) assays with template DNA from two or more individuals or, in
plants, near isogenic lines. A change in the number of tandem repeats between the SSR-flanking
sequences produces differently sized fragments (U.S. Patent No. 5,766,847). Alternatively,
polymorphisms can be identified by using the PCR fragment produced from the SSR-flanking
sequence specific primer reaction as a probe against Southern blots representing different individuals
(Refseth et al., Electrophoresis 18:1519 (1997)). Rice SSRs were used to map a molecular
marker closely linked to a nuclear restorer gene for fertility in rice as described by Akagi et al.
(Genome 39:205 (1996)).
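
As an illustration of how SSRs might be located within a sequence before flanking primers are designed, the following Python sketch scans for perfect tandem repeats with a regular expression. The unit sizes, the minimum repeat count, the function name, and the example sequences are all assumptions chosen for the example and are not taken from the cited references.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=4, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    A unit of min_unit..max_unit bases must occur at least min_repeats times
    in a row. Runs of a single base (e.g. AAAA...) are skipped.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    found = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # homopolymer run, not reported here
        found.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return found


# Two hypothetical alleles that differ only in the number of GA repeats
allele_1 = "TTGACC" + "GA" * 6 + "CCTAGG"
allele_2 = "TTGACC" + "GA" * 9 + "CCTAGG"
ssrs_1 = find_ssrs(allele_1)
ssrs_2 = find_ssrs(allele_2)
size_difference = len(allele_2) - len(allele_1)
```

Primers placed in the unique flanking sequence would amplify fragments that differ by the 6 bp size difference here, which is the length polymorphism scored in SSR mapping.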

The nucleotide sequences of the present invention can be used to identify and develop a 
variety of microsatellite markers, including the SSRs described above, as genetic markers for
comparative analysis and mapping of genomes. The nucleotide sequences of the present invention
can be used in a variation of the SSR technique known as inter-SSR (ISSR), which uses
microsatellite oligonucleotides as primers to amplify genomic segments different from the repeat
region itself (Zietkiewicz et al., Genomics 20:176 (1994)). ISSR employs oligonucleotides based
on a simple sequence repeat anchored or not at their 5'- or 3'-end by two to four arbitrarily chosen
nucleotides, which triggers site-specific annealing and initiates PCR amplification of genomic
segments which are flanked by inversely orientated and closely spaced repeat sequences. In one
embodiment of the present invention, microsatellite markers derived from the nucleotide sequences
disclosed in the Sequence Listing, or substantially similar sequences or allelic variants thereof, may be
used to detect the appearance or disappearance of markers indicating genomic instability as
described by Leroy et al. (Electron. J. Biotechnol. 3(2), at http://www.ejb.org (2000)), where
alteration of a fingerprinting pattern indicated loss of a marker corresponding to a part of a gene
involved in the regulation of cell proliferation. Microsatellite markers derived from nucleotide
sequences as provided in the Sequence Listing will be useful for detecting genomic alterations such
as the change observed by Leroy et al. (Electron. J. Biotechnol. 3(2), supra (2000)) which
appeared to be the consequence of microsatellite instability at the primer binding site or modification
of the region between the microsatellites, and illustrated somaclonal variation leading to genomic 
instability. Consequently, the nucleotide sequences of the present invention are useful for detecting 
genomic alterations involved in somaclonal variation, which is an important source of new 
phenotypes.

In addition, because the genomes of closely related species are largely syntenic (that is, they
display the same ordering of genes within the genome), these maps can be used to isolate novel
alleles from wild relatives of crop species by positional cloning strategies. This shared synteny is very
powerful for using genetic maps from one species to map genes in another. For example, a gene
mapped in rice provides information for the gene location in maize and wheat.

The various types of maps discussed above can be used with the nucleotide sequences of the 
invention to identify Quantitative Trait Loci (QTLs) for a variety of uses, including marker-assisted 
breeding. Many important crop traits are quantitative traits and result from the combined interactions 
of several genes. These genes reside at different loci in the genome, often on different chromosomes,
and generally exhibit multiple alleles at each locus. Developing markers, tools, and methods to
identify and isolate the QTLs involved in regulating the content and composition of the plant grain
enables marker-assisted breeding to enhance the nutritional value of the grain or suppress
undesirable traits that interfere with an efficient grain filling process. The nucleotide sequences as
provided in the Sequence Listing can be used to generate markers, including single-sequence repeats
(SSRs) and microsatellite markers, for QTLs and for use in marker-assisted breeding. The
nucleotide sequences of the invention can be used to identify QTLs regulating the grain filling process
and isolate alleles as described by Li et al. in a study of QTLs involved in resistance to a pathogen of
rice (Li et al., Mol Gen Genet 261:58 (1999)). In addition to isolating QTL alleles in rice, other
cereals, and other monocot and dicot crop species, the nucleotide sequences of the invention can
also be used to isolate alleles from the corresponding QTL(s) of wild relatives. Transgenic plants
having various combinations of QTL alleles can then be created and the effects of the combinations
measured. Once an ideal allele combination has been identified, crop improvement can be
accomplished either through biotechnological means or by directed conventional breeding programs
(Flowers et al., J Exp Bot 51:99 (2000); Tanksley and McCouch, Science 277:1063 (1997)).

In another embodiment the nucleotide sequences of the invention can be used to help create
physical maps of the genome of maize, Arabidopsis and related species. Where the nucleotide
sequences of the invention have been ordered on a genetic map, as described above, then the
nucleotide sequences of the invention can be used as probes to discover which clones in large
libraries of plant DNA fragments in YACs, PACs, etc. contain the same nucleotide sequences of the
invention or similar sequences, thereby facilitating the assignment of the large DNA fragments to
chromosomal positions. Subsequently, the large BACs, YACs, etc. can be ordered unambiguously
by more detailed studies of their sequence composition and by using their end or other sequence to
find the identical sequences in other cloned DNA fragments (Mozo et al., Nat Genet 22:271
(1999)). Overlapping DNA sequences in this way allows assembly of large sequence contigs that,
when sufficiently extended, provide a complete physical map of a chromosome. The nucleotide 
sequences of the invention themselves may provide the means of joining cloned sequences into a 
contig, and are useful for constructing physical maps. 
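
The idea of joining cloned fragments through shared end sequence can be sketched in a few lines of Python. This is a deliberately simplified illustration that assumes exact, error-free overlaps on the same strand; the function name, the minimum overlap, and the example fragments are hypothetical, and real contig assembly must also handle sequencing errors, repeats, and reverse-complement orientation.

```python
def merge_if_overlapping(left, right, min_overlap=20):
    """Join two fragments when the end of `left` exactly matches the start of
    `right` over at least `min_overlap` bases; return None otherwise.
    The longest exact overlap is preferred."""
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical short fragments sharing the six bases GGTTCA
fragment_1 = "ACGTACGGTTCA"
fragment_2 = "GGTTCATTGACC"
contig = merge_if_overlapping(fragment_1, fragment_2, min_overlap=4)
```

Repeating such joins across many clones extends the contig, which corresponds to the stepwise ordering of large-insert clones described above.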

In another embodiment, the nucleotide sequences of the present invention may be useful in
mapping and characterizing the genomes of other cereals. Rice has been proposed as a model for
cereal genome analysis (Havukkala, Curr Opin Genet Devel 6:711 (1996)), based largely on its
smaller genome size and higher gene density, combined with the considerable conserved gene order
among cereal genomes (Ahn et al., Mol Gen Genet 241:483 (1993)). The cereals demonstrate
both general conservation of gene order (synteny) and considerable sequence homology among
various cereal gene families. This suggests that studies on the functions of genes or proteins from rice
according to the present invention could lead to elucidation of the functions of orthologous genes or
proteins in other cereals, including maize, wheat, secale, sorghum, barley, millet, teff, milo, triticale,
flax, gramma grass, Tripsacum sp., and teosinte. The nucleotide sequences according to the
invention can also be used to physically characterize homologous chromosomes in other cereals, as
described by Sarma et al. (Genome 43:191 (2000)), and their use can be extended to non-cereal
monocots such as sugarcane, grasses, and lilies. 

Given the synteny between rice and other cereal genomes, the nucleotide sequences of the 
present invention can be used to obtain molecular markers for mapping and, potentially, for 
positional cloning. Kilian et al. described the use of probes from the rice genomic region of interest
to isolate a saturating number of polymorphic markers in barley, which were shown to map to
syntenic regions in rice and barley, suggesting that the nucleotide sequences of the invention derived
from the rice genome would be useful in positional cloning of syntenic grain-filling genes of interest
from other cereal species (Kilian et al., Nucl Acids Res 23:2729 (1995); Kilian et al., Plant Mol
Biol 35:187 (1997)). Synteny between rice and barley has recently been reported in the area of the
carrying malting quality QTLs (Han et al., Genome 41:373 (1998)), and use of synteny between
cereals for positional cloning efforts is likely to add considerable value to rice genome analysis.
Likewise, mapping of the ligules region of sorghum was facilitated using molecular markers from a
syntenic region of the rice genome (Zwick et al., Genetics 148:1983 (1998)).

Rice marker technology utilizing the nucleotide sequences of the present invention can also be
used to identify QTL alleles from a wild relative of cultivated rice, for example as described by Xiao
et al. (Genetics 150:899 (1998)). Wild relatives of domesticated plants represent untapped pools
of genetic resources for abiotic and biotic stress resistance, apomixis and other breeding strategies,
plant architecture, determinants of yield, secondary metabolites, and other valuable traits. In rice,
Xiao et al. (supra) used molecular markers to introduce an average of approximately 5% of the
genome of a wild relative, and the resulting plants were scored for phenotypes such as plant height,
panicle length and 1000-grain weight. Trait-improving alleles were found for all phenotypes except
plant height, where any change is considered negative. Of the 35 trait-improving alleles, Xiao et al.
found that 19 had no effect on other phenotypes whereas 16 had deleterious effects on other traits.
The nucleotide sequences of the invention such as those provided in the Sequence Listing can be
employed as molecular markers to identify QTL alleles involved in the regulation of the grain filling
process from a wild relative, by which these valuable traits can be introgressed from wild relatives
using methods including, but not limited to, that described by Xiao et al. ((1998) supra).
Accordingly, the nucleotide sequences of the invention can be employed in a variety of molecular
marker technologies for yield improvement.

Following the procedures described above to identify polymorphisms, and using a plurality of 
the nucleotide sequences of the invention, any individual (or line) can be genotyped. Genotyping a 
large number of DNA polymorphisms such as single nucleotide polymorphisms (SNPs), in breeding 
lines makes it possible to find associations between certain polymorphisms or groups of
polymorphisms, and certain phenotypes. In addition to sequence polymorphisms, length
polymorphisms such as triplet repeats are studied to find associations between polymorphism and
phenotype. Genotypes can be used for the identification of particular cultivars, varieties, lines, 
ecotypes, and genetically modified plants or can serve as tools for subsequent genetic studies of 
complex traits involving multiple phenotypes.

The patent publication WO95/35505 and U.S. Patent Nos. 5,445,943 and 5,410,270
describe scanning multiple alleles of a plurality of loci using hybridization to arrays of
oligonucleotides. The nucleotide sequences of the invention are suitable for use in genotyping
techniques useful for each of the types of mapping discussed above.

In a preferred embodiment, the nucleotide sequences of the invention are useful for identifying
and isolating at least one unique stretch of protein-encoding nucleotide sequence. The nucleotide
sequences of the invention are compared with other coding sequences having sequence similarity
with the sequences provided in the Sequence Listing, using a program such as BLAST. Comparison
of the nucleotide sequences of the invention with other similar coding sequences permits the
identification of one or more unique stretches of coding sequences encoding polypeptides that are
up-regulated during grain filling that are not identical to the corresponding coding sequence being
screened. Preferably, a unique stretch of coding sequence of about 25 base pairs (bp) long is
identified, more preferably 25 bp, or even more preferably 22 bp, or 20 bp, or yet even more
preferably 18 bp or 16 bp or 14 bp. In one embodiment, a plurality of nucleotide sequences is
screened to identify unique coding sequences according to the invention. In one embodiment, one or
more unique coding sequences according to the invention can be applied to a chip as part of an
array, or used in a non-chip array system. In a further embodiment, a plurality of unique coding
sequences according to the invention is used in a screening array. In another embodiment, one or
more unique coding sequences according to the invention can be used as immobilized probes or as
probes in solution. In yet another embodiment, one or more unique coding sequences according to the
invention can be used as primers for PCR. In a further embodiment, one or more unique coding
sequences according to the invention can be used as organism-specific primers for PCR in a solution
containing DNA from a plurality of sources. 
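
The search for unique stretches described above is performed in the text with a similarity program such as BLAST. As a much simpler stand-in, the following Python sketch lists fixed-length windows of a target sequence that have no exact match, on either strand, in a set of comparison sequences. It uses exact matching only, so it is not equivalent to a BLAST comparison; the function names, the window length, and the example sequences are assumptions for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an uppercase DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def unique_stretches(target, others, k=25):
    """Return (position, stretch) for every k-base window of `target` that has
    no exact match in any sequence of `others` or in its reverse complement."""
    seen = set()
    for sequence in others:
        for strand in (sequence, reverse_complement(sequence)):
            for i in range(len(strand) - k + 1):
                seen.add(strand[i:i + k])
    return [
        (i, target[i:i + k])
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in seen
    ]


# Toy example with a short window so that the result is easy to follow
candidates = unique_stretches("ACGTTGCA", ["ACGTTTTT"], k=4)
```

Windows reported this way are candidates only; near-identical matches with one or two mismatches would still need to be excluded with a proper similarity search before a stretch is used as a probe or organism-specific primer.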

In another embodiment unique stretches of nucleotide sequences according to the invention are
identified that are preferably about 30 bp, more preferably 50 bp or 75 bp, yet more preferably 100
bp, 150 bp, 200 bp, 250 bp, 500 bp, 750 bp, or 1000 bp. The length of a unique coding sequence
may be chosen by one of skill in the art depending on its intended use and on the characteristics of
the nucleotide sequence being used. In one embodiment, unique coding sequences according to the
invention may be used as probes to screen libraries to find homologs, orthologs, or paralogs. In
another embodiment, unique coding sequences according to the invention may be used as probes to
screen genomic DNA or cDNA to find homologs, orthologs, or paralogs. In yet another
embodiment, unique coding sequences according to the invention may be used to study gene
evolution and genome evolution. 

EXAMPLES 

The invention will be further described by reference to the following detailed examples. These 
examples are provided for purposes of illustration only, and are not intended to be limiting unless 
otherwise specified. Standard recombinant DNA and molecular cloning techniques used here are 
well known in the art and are described in detail in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (1989)) and by Ausubel et al.
(Current Protocols in Molecular Biology, Greene Publishing (1992)).

Example 1: Isolation and sequencing of DNA fragments 

1.1 Isolation and sequencing of genomic DNA fragments

Genomic DNA was isolated from nuclei of Oryza sativa L. ssp japonica cv Nipponbare and
then sheared to produce fragments of approximately 500 bp. Using a method derived from the
method of Mao et al. (Genome Res 10:982 (2000)), seeds were germinated on cheese cloth
immersed in water and grown for 4-6 weeks under greenhouse conditions. After plants reached a
height of approximately 5-8 inches, the upper parts of the green leaves were harvested and wrapped
in aluminum foil at 4°C overnight. Leaf material was then stored at -80°C or directly used for
extraction of nuclei. Intact nuclei were isolated by homogenization (in a blender for fresh material or
by grinding with mortar and pestle for frozen material) in a buffer containing 10 mM Trizma base, 80
mM KCl, 10 mM EDTA, 1 mM spermidine, 1 mM spermine, 0.5 M sucrose, 0.5% Triton X-100,
0.15% β-mercaptoethanol pH 9.5. The homogenate was filtered and nuclei recovered by gentle
centrifugation using a fixed-angle rotor at 1,800 g at 4°C for 20 minutes. The pellet recovered after
centrifugation was gently resuspended with the assistance of a small paint brush soaked in ice cold
wash buffer and wash buffer added. Particulate matter remaining in the suspension was removed by
filtering the resuspended nuclei into a 50 ml centrifuge tube through two layers of miracloth by gravity
and centrifuging the filtrate at 57 g (500 rpm), 4°C for 2 minutes to remove intact cells and tissue
residues. The supernatant fluid was transferred into a fresh centrifuge tube and nuclei were pelleted
by centrifugation at 1,800 g, 4°C for 15 minutes in a swinging bucket centrifuge.

DNA was isolated from the nuclear preparation by phenol/chloroform extraction, as in
Sambrook et al. (supra). Isolated total genomic DNA was physically sheared (Hydroshear) to
generate random DNA fragments, and fragments of approximately 500 bp were
recovered. DNA was eluted and the ends filled in using T4 DNA polymerase, Klenow fragment,
and dNTPs. Double-stranded DNA was linkered and cloned into a Novartis proprietary medium-copy
vector derived from pSC101.

Vector inserts were amplified by PCR and sequenced using the MegaBACE sequencing
system (Molecular Dynamics, Amersham). The amplification reaction was diluted before use and
was not purified using an exonuclease/alkaline phosphatase procedure. Sequencing reactions were
performed using the DYEnamic ET Terminator Kit. The reactions contained approximately 50 ng of
amplicon, DYEnamic ET Terminator premix, and 5 pmol of -40 M13 forward primer. The
sequencing reaction was amplified for 30 cycles, and reaction products were concentrated and purified
using ethanol precipitation. The sample was electrokinetically injected into the capillary at 3 kV for
45 sec and separated via electrophoresis at 9 kV for 120 min.
1.2 Isolation and sequencing of cDNA fragments

Construction of rice cDNA library. Total RNA was purified from rice plant tissue using
standard total RNA purification methods. PolyA+ RNA was isolated from the total RNA using the
Qiagen Oligotex mRNA purification system (Qiagen, Valencia, CA), and cDNA was generated
using cDNA synthesis reagents from Life Technologies (Rockville, MD). First strand cDNA
synthesis was catalyzed by reverse transcriptase using oligo dT primers with a NotI restriction site.
Second strand synthesis was catalyzed by DNA polymerase. An oligonucleotide linker with a SalI
restriction endonuclease site was attached to the 5' end of the cDNAs using DNA ligase. The
cDNAs were digested with NotI and SalI restriction endonucleases and inserted into an E. coli-
replicating plasmid harboring a selectable marker. E. coli was transfected with the recombinant
plasmids and grown on selectable media. E. coli colonies were individually picked off the selectable
media and placed into storage plates.

Sequencing the rice cDNA library. The DNA sequence of the cDNA cloned into the
plasmid purified from an E. coli colony was determined using standard dideoxy sequencing methods.
Oligonucleotide primers respectively corresponding to plasmid DNA regions upstream of the 5' end
of the cDNA insert (Forward reaction) and downstream of the 3' end of the cDNA insert (Reverse
reaction) were used in the dideoxy sequencing reactions. If the DNA sequences determined as a
result of the Forward and Reverse reactions from the cDNA overlapped, the two sequences could
be merged into a contig using computerized analysis software (Consed, University of
Washington, Seattle), to assemble a full-length sequence of the cDNA. In cases where the DNA
sequences from the Forward and Reverse reactions from a single clone did not overlap sufficiently to
be assembled into a contig, such that a region of unsequenced DNA remained between the DNA
from the Forward and Reverse reactions, the DNA sequence of the
separating region was determined using one of two dideoxy sequencing methods. In a "primer
walking" approach, a primer specifically corresponding to the 3' end of the DNA sequence
determined from the Forward reaction was used in a second dideoxy sequencing reaction. The
primer walking procedure was repeated until the DNA sequence that separated the original Forward
and Reverse sequences was resolved and a contig could be assembled. Alternatively, the clone harboring the
cDNA was subjected to transposon in vitro insertion dideoxy sequencing (Epicentre, Madison, WI).
In this procedure, the insertion process was random and the result was multiple DNA sequence
coverage over the targeted cDNA, where the sequences thus obtained were assembled into a contig.
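
The decision described above (merge the Forward and Reverse reads when they overlap, otherwise fall back to primer walking or transposon insertion sequencing) can be sketched in a few lines of Python. This is only a toy illustration that assumes error-free reads and exact overlaps; it is not how Consed works, and the function names and the 20-base minimum overlap are assumptions made for the example.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def merge_reads(forward, reverse, min_overlap=20):
    """Merge a Forward read with the reverse complement of a Reverse read.

    Returns the merged contig when the end of the Forward read matches the
    start of the reverse-complemented Reverse read over at least
    `min_overlap` bases, or None when the reads do not overlap sufficiently
    (the case that would call for primer walking)."""
    fwd = forward.upper()
    rev = reverse_complement(reverse)
    longest = min(len(fwd), len(rev))
    for size in range(longest, min_overlap - 1, -1):
        if fwd[-size:] == rev[:size]:
            return fwd + rev[size:]
    return None
```

A return value of None corresponds to the situation in which a region of unsequenced DNA separates the two reads.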



Example 2: GeneChip® Standard Protocol

The standard protocol for using the GeneChip® to quantitatively measure plant gene
expression was carried out as outlined below:

Quantitation of total RNA

Total RNA from plant tissue was extracted and quantified.
1. Quantified total RNA using GeneQuant
   1 OD260 = 40 µg RNA/ml; A260/A280 = 1.9 to about 2.1
2. Ran gel to check the integrity and purity of the extracted RNA
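
The spectrophotometric quantitation step can be expressed as a short calculation. The Python sketch below assumes the conventional conversion of 40 µg RNA per ml for one A260 unit and the acceptance window of 1.9 to 2.1 for the A260/A280 ratio; the function names and the dilution-factor parameter are illustrative assumptions.

```python
def rna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """RNA concentration, assuming 1 A260 unit corresponds to 40 ug RNA/ml."""
    return a260 * 40.0 * dilution_factor


def purity_ok(a260, a280, low=1.9, high=2.1):
    """True when the A260/A280 ratio lies within the accepted window."""
    return low <= a260 / a280 <= high
```

For example, a 1:50 dilution reading A260 = 0.5 corresponds to 0.5 x 40 x 50 = 1000 µg/ml of undiluted RNA.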
Synthesis of double-stranded cDNA

Gibco/BRL Superscript Choice System for cDNA Synthesis (Cat# 18090-019) was
employed to prepare cDNAs. T7-(dT)24 oligonucleotides were prepared and purified by HPLC
(5'-GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGG-(dT)24-3'; SEQ ID
NO:4709).

Step 1. Primer hybridization:
    Incubated at 70°C for 10 minutes
    Spun quickly and put on ice briefly

Step 2. Temperature adjustment:
    Incubated at 42°C for 2 minutes

Step 3. First strand synthesis carried out using:
    DEPC-water                             1 µl
    RNA (10 µg final)                     10 µl
    T7-(dT)24 Primer (100 pmol final)      1 µl
    5X 1st strand cDNA buffer              4 µl
    0.1 M DTT (10 mM final)                2 µl
    10 mM dNTP mix (500 µM final)          1 µl
    Superscript II RT (200 U/µl)           1 µl
    Total                                 20 µl
    Mixed well
    Incubated at 42°C for 1 hour

Step 4. Second strand synthesis:
    Placed reactions on ice, quick spin
    DEPC-water                            91 µl
    5X 2nd strand cDNA buffer             30 µl
    10 mM dNTP mix (250 µM final)          3 µl
    E. coli DNA ligase (10 U/µl)           1 µl
    E. coli DNA polymerase I (10 U/µl)     4 µl
    RNase H (2 U/µl)                       1 µl
    T4 DNA polymerase (5 U/µl)             2 µl
    0.5 M EDTA (0.5 M final)              10 µl
    Total                                162 µl
    Mixed/spun down/incubated at 16°C for 2 hours

Step 5. Completing the reaction:
    Incubated at 16°C for 5 minutes
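
The reaction volumes listed above can be cross-checked arithmetically: the first strand components should sum to 20 µl, and the first strand reaction plus the second strand additions should sum to 162 µl. The Python sketch below performs that bookkeeping and recomputes two of the stated final concentrations; the dictionary and function names are assumptions made for this illustration.

```python
# Volumes in microlitres, transcribed from the protocol steps above.
FIRST_STRAND_UL = {
    "DEPC-water": 1,
    "RNA": 10,
    "T7-(dT)24 primer": 1,
    "5X first strand cDNA buffer": 4,
    "0.1 M DTT": 2,
    "10 mM dNTP mix": 1,
    "Superscript II RT": 1,
}
SECOND_STRAND_ADDITIONS_UL = {
    "DEPC-water": 91,
    "5X second strand cDNA buffer": 30,
    "10 mM dNTP mix": 3,
    "E. coli DNA ligase": 1,
    "E. coli DNA polymerase I": 4,
    "RNase H": 1,
    "T4 DNA polymerase": 2,
    "0.5 M EDTA": 10,
}


def total_volume_ul(*mixes):
    """Sum the volumes of one or more component tables."""
    return sum(sum(mix.values()) for mix in mixes)


def final_concentration(stock_conc, added_ul, total_ul):
    """Final concentration, in the same unit as the stock, after dilution."""
    return stock_conc * added_ul / total_ul
```

With these tables, 2 µl of 0.1 M (100 mM) DTT in 20 µl gives 10 mM, and 1 µl of 10 mM dNTP mix in 20 µl gives 0.5 mM (500 µM), in agreement with the values stated for the first strand reaction.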
Purification of double-stranded cDNA
1. Centrifuged PLG (Phase Lock Gel, Eppendorf 5 Prime Inc., p1-188233) at 14,000X,
   transferred 162 µl of cDNA to PLG
2. Added 162 µl of Phenol:Chloroform:Isoamyl alcohol (pH 8.0), centrifuged 2 minutes
3. Transferred the supernatant to a fresh 1.5 ml tube, added
       Glycogen (5 mg/ml)            2 µl
       0.5 M NH4OAc (0.75xVol)     120 µl
       EtOH (2.5xVol, -20°C)       400 µl
4. Mixed well and centrifuged at 14,000X for 20 minutes
5. Removed supernatant, added 0.5 ml 80% EtOH (-20°C)
6. Centrifuged for 5 minutes, air dried or dried by speed vac for 5-10 minutes
7. Added 44 µl DEPC H2O

Analyzed quantity and size distribution of cDNA
Ran a gel using 1:1 ratio of the double-stranded synthesis product to loading buffer
Synthesis of biotinylated cRNA
(used Enzo BioArray High Yield RNA Transcript Labeling Kit Cat# 900182)
    Purified cDNA                   22 µl
    10X HY buffer                    4 µl
    10X biotin ribonucleotides       4 µl
    10X DTT                          4 µl
    10X RNase inhibitor mix          4 µl
    20X T7 RNA polymerase            2 µl
    Total                           40 µl
Centrifuged 5 seconds, and incubated for 4 hours at 37°C
Gently mixed every 30-45 minutes
Purification and quantification of cRNA
(used Qiagen RNeasy Mini kit Cat# 74103)
    cRNA             40 µl
    DEPC H2O         60 µl
    RLT buffer      350 µl   mix by vortexing
    EtOH            250 µl   mix by pipetting
    Total           700 µl
Waited 1 minute or more for the RNA to stick
Centrifuged at 2000 rpm for 5 minutes
    RPE buffer      500 µl
Centrifuged at 10,000 rpm for 1 minute
    RPE buffer      500 µl
Centrifuged at 10,000 rpm for 1 minute
Centrifuged at 10,000 rpm for 1 minute to dry the column
    DEPC H2O         30 µl
Waited for 1 minute, then eluted cRNA by centrifugation, 10K for 1 minute
    DEPC H2O         30 µl
Repeated previous step
Determined concentration and diluted to 1 µg/µl concentration
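
The final dilution to 1 µg/µl follows from conservation of mass (C1 x V1 = C2 x V2). A minimal Python sketch, with a hypothetical function name and illustrative numbers that are not taken from the protocol:

```python
def diluent_volume_ul(stock_ug_per_ul, stock_volume_ul, target_ug_per_ul=1.0):
    """Volume of water to add so the eluate reaches the target concentration."""
    if stock_ug_per_ul < target_ug_per_ul:
        raise ValueError("eluate is already more dilute than the target")
    final_volume_ul = stock_ug_per_ul * stock_volume_ul / target_ug_per_ul
    return final_volume_ul - stock_volume_ul
```

For instance, 20 µl of eluate measured at 2.5 µg/µl would need 30 µl of DEPC-treated water to reach 1 µg/µl in a final volume of 50 µl.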
Fragmentation of cRNA
    cRNA (1 µg/µl)                  15 µl
    5X Fragmentation Buffer*         6 µl
    DEPC H2O                         9 µl
    Total                           30 µl

*5X Fragmentation Buffer
    1 M Tris (pH 8.1)              4.0 ml
    MgOAc                          0.64 g
    KOAc                           0.98 g
    DEPC H2O
    Total                           20 ml
    Filter Sterilize
Array washed and stained in:
    Stringent Wash Buffer**
    Non-Stringent Wash Buffer***
    SAPE Stain****
    Antibody Stain*****
Washed on fluidics station using the appropriate antibody amplification protocol

**Stringent Buffer: 12X MES 83.3 ml, 5 M NaCl 5.2 ml, 10% Tween 1.0 ml, H2O 910 ml,
Filter Sterilize
***Non-Stringent Buffer: 20X SSPE 300 ml, 10% Tween 1.0 ml, H2O 698 ml, Filter
Sterilize, Antifoam 1.0
****SAPE Stain: 2X Stain Buffer 600 µl, BSA 48 µl, SAPE 12 µl, H2O 540 µl
*****Antibody Stain: 2X Stain Buffer 300 µl, H2O 266.4 µl, BSA 24 µl, Goat IgG 6 µl,
Biotinylated Ab 3.6 µl

Example 3: Profiling of genes involved in nutrition partitioning during grain development

A GeneChip® Rice Genome Array (Affymetrix, Santa Clara, CA) was used to examine how
accumulation of carbohydrates, storage protein and fatty acids is coordinated at the RNA level during
grain development.






RNA expression of three major pathways and associated genes involved in nutrition partitioning
was examined, including synthesis and transport of carbohydrates, proteins, and fatty acids. A total
of 491 genes involved in these pathways were first selected based on their sequence annotation and
functional classification. RNA expression was determined in 39 samples representing different
developmental stages, including samples collected before and during grain filling.

3.1 Plant Growth Conditions and Sampling

Nipponbare rice was grown in the greenhouse with a 12 hr light cycle and a temperature of 29°C
during the day and 21°C during the night. Humidity was maintained at 30%. Plants were grown in
pots containing 50% sunshine mix and 50% nitrohumus. The descriptions of the samples collected
for this analysis are listed in Table 1. Individual tissues were collected from a minimum of five plants
and pooled. Total RNA was extracted from one gram of tissue using the Qiagen RNeasy Maxi kit
(Qiagen, Valencia, CA).

The experiments were carried out as described in T. Zhu et al., Plant Physiol. Biochem. 39,
221 (2001).



Table 1 Rice samples included in the study of genes involved in nutrition partitioning during grain
development

Description                       Days after    Developmental   Rank   Category
                                  germination   stage
germinating seedling (root)       5             11              1      root
germinating seedling (leaf)       5             12              1      leaf
3-4 leaf aerial                   18            13              2      aerial
tillering stage (root)            49            14              3      root
tillering stage (leaf)            49            15              3      leaf
tillering stage (aerial)          49            16              3      aerial
Booting stage panicle 1-3 cm      60            17              4      repr
Booting stage panicle 4-7 cm      62            18              5      repr
Booting stage panicle 8-14 cm     64            19              6      repr
Booting stage panicle 15-20 cm    66            20              7      repr
Booting stage root                60            22              6      root
Booting stage leaf                60            23              6      leaf
Booting stage aerial              60            24              6      aerial
panicle emergence - root          78            25              8      root
panicle emergence - stem          78            26              8      stem
panicle emergence - panicle       78            21              8      repr
Seed milk stage [~9 DAF]          88            39                     repr
Seed soft dough [~14 DAF]         94            40              14     repr
Seed hard dough [~21 DAF]         100           41              15     repr
inflorescence - no seeds          88            30              9      repr
maturation stem                   90            27              15     stem
maturation root                   90            28              15     root
maturation leaf                   90            29              15     leaf
embryo                            88            42              14     embryo
endosperm                         88            43              14     endospm
seed coat                         88            44              14     coat
Senescence - stem                 100           31              16     stem
Senescence (leaf)                 100           32              16     leaf
aleurone                          88            45              14     aleurone
pollen mixed                      55            33                     pollen
seed day 0 post anthesis          79            34              9      repr
seed day 2 post anthesis          81            35              10     repr
seed day 4 post anthesis          83            36              11     repr
seed day 7 post anthesis          86            37              12     repr
seed day 8 post anthesis          87            38              13     repr



Example 4: Characterization of Gene Expression Profiles

4.1 Data analysis I

A rice gene array and probes derived from rice RNA extracted from different tissues and
developmental stages were used to identify the expression profile of genes on the chip. The rice
array contains over 23,000 genes (approximately 18,000 unique genes), or roughly 50% of the rice
genome, and is similar to the Arabidopsis GeneChip® (Affymetrix) with the exception that the 16
oligonucleotide probe sets do not contain mismatch probe sets. The level of expression is therefore
determined by internal software that analyzes the intensity level of the 16 probe sets for each gene.
The highest and lowest probes are removed if they do not fit into a set of predefined statistical
criteria and the remaining sets are averaged to give an expression value. The final expression values
are normalized by software, as described below. The advantages of a gene chip in such an analysis
include a global gene expression analysis, quantitative results, a highly reproducible system, and a
higher sensitivity than Northern blot analyses.
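
The averaging scheme just described (drop the highest and/or lowest probe when it fails a statistical criterion, then average the rest) can be sketched as follows. The text does not disclose the "predefined statistical criteria", so the z-score cutoff of 2.0 used here is purely a stand-in assumption, as are the function and parameter names.

```python
from statistics import mean, stdev


def expression_value(intensities, z_cutoff=2.0):
    """Average probe intensities for one gene, discarding the highest and/or
    lowest probe when it lies more than `z_cutoff` sample standard deviations
    from the mean of all probes (an assumed criterion, for illustration)."""
    values = sorted(intensities)
    if len(values) < 4:
        return mean(values)
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return centre
    keep_from, keep_to = 0, len(values)
    if (values[-1] - centre) / spread > z_cutoff:
        keep_to -= 1      # drop the highest probe
    if (centre - values[0]) / spread > z_cutoff:
        keep_from += 1    # drop the lowest probe
    return mean(values[keep_from:keep_to])
```

With fifteen probes at intensity 100 and one at 1000, the high probe is discarded and the expression value is 100 rather than the untrimmed mean of 156.25. Normalization across arrays is a separate step and is not shown.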
4.2 Data analysis II

Data analysis was done using GeneSpring (Silicon Genetics, Redwood City, CA) and AlignAce.
The genechip sequence was aligned by BLAST against the AC rice contig sequences. The contig with the best
alignment was extracted and five gene prediction programs were run on each contig. The programs
used were Genscan trained on Arabidopsis and maize, Gmhmm trained on rice and Arabidopsis, and
Fgenesh and Glimmer trained on rice. All of the predicted CDSs were aligned by BLAST against the genechip
sequence again to extract the top hit predicted CDS. A Perl script was utilized to extract up to 2 kb
of the putative promoter sequence. In some of the genechip sequences there was more than one
perfect alignment to a predicted CDS; in such cases, both of the perfect alignments were accepted
as the putative genes.
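
The promoter extraction step can be illustrated with a short sketch. The original analysis used a Perl script that is not reproduced in this document; the Python function below is an independent illustration that assumes 0-based, end-exclusive CDS coordinates on the forward strand of the contig, with all names being hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def upstream_region(contig, cds_start, cds_end, strand, max_len=2000):
    """Return up to `max_len` bases immediately upstream of a predicted CDS.

    `cds_start` and `cds_end` are 0-based, end-exclusive coordinates on the
    forward strand. For a minus-strand CDS the upstream region lies after
    `cds_end` and is returned reverse-complemented, so the result always
    reads 5' to 3' toward the start codon."""
    if strand == "+":
        begin = max(0, cds_start - max_len)
        return contig[begin:cds_start]
    if strand == "-":
        end = min(len(contig), cds_end + max_len)
        return contig[cds_end:end].translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")
```

The region is truncated at the contig boundary, which is why the text speaks of "up to" 2 kb of putative promoter sequence.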

Table 2: Table 2 provides a subset of rice genes the expression of which is up-regulated
during grain filling. Further identified are SSR sequences in the coding region of the rice genes.

A = Genes involved in rice grain filling, which belong to the functional category of carbohydrate metabolism
B = Genes involved in rice grain filling, which belong to the functional category of transmembrane proteins
C = Genes involved in rice grain filling, which belong to the functional category of storage proteins
D = Genes involved in rice grain filling, which belong to the functional category of stress response proteins
E = 345 Grain Filling Genes
F = Genes involved in rice grain filling, which belong to the functional category of signaling molecules
G = Genes involved in rice grain filling, which belong to the functional category of transcription factors
H = Genes involved in rice grain filling, which belong to the functional category of amino acid metabolism
I = Genes involved in rice grain filling, which belong to the functional category of fatty acid metabolism
J = Cereal_Grain_Filling_QTLs (a description of the respective QTLs is provided in Table . . . below)
K = Beginning of the SSR
L = End of the SSR
M = Nucleotide sequence of the tri- and tetra-nucleotide repeat units
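
Columns K, L and M record where a simple sequence repeat (SSR) begins and ends within a coding sequence and what its repeat unit is. The Python sketch below shows one way such coordinates could be derived from a sequence. It is not the procedure used to build Table 2: the minimum of five tandem copies, the restriction to units of three and four bases, and the 1-based inclusive coordinates are all assumptions made for illustration.

```python
import re


def find_ssrs(sequence, unit_sizes=(3, 4), min_repeats=5):
    """Return (start, end, unit) for perfect tandem repeats.

    Coordinates are 1-based and inclusive. A repeat is reported when a unit
    of one of the given sizes occurs at least `min_repeats` times in a row.
    Homopolymer runs (for example AAA repeated) are skipped."""
    seq = sequence.upper()
    hits = []
    for size in unit_sizes:
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue
            hits.append((match.start() + 1, match.end(), unit))
    return sorted(hits)
```

For example, six tandem copies of CCT starting at position 3 of a sequence are reported as (3, 20, "CCT"), an 18-base span analogous to a K/L/M entry of the form "42 / 59 / CCT". Note that the reported unit depends on the reading phase in which the repeat is first encountered.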




SEQ ID   A   B   C   D   E   F   G   H   I   J   K   L   M


101 


X 








X 


















113 


X 








X 












42 


59 


CCT 


I 








A 


A 


















317 


X 








-\r 

A 




X 














329 










X 










OS-FLLEN-9-1,
OS-GPL-4-1,
OS-GPP-4-1,

OS-GW100-4-1,








1 f j 


v 

A 








Y 
A 




























A 










Uo-UW-> 1 

ZM-MOIST-4-3 

2M-DMY-4-3, 

ZM-YIiM-l 




1 0 




333 










X 


















233 


. 




X 




X 


















335 








- 


X 


















119 


X 








X 


- 




- 


- 










311 


X 








X 




X 








358 
661 


372 
675 


CGC 
CGG 


149 


X 








X 


















337 










X 


















59 




X 






X 


















339 










X 


















155 


X 








X 












1207 


1221 


CTG 


143 


X 








X 


















307 










X 




X 








155 


175 


CTG 


341 










X 


















193 


X 








X 










SMS015-9, 

ZM-MOIST-4-2, 

ZM-DMY-4-1 


1401 


1415 


CGT 


131 


X 








X 
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199 


X 








X 










OS-AE-1-1, 

OS-AE-5-1, 

OS-APDF-9-1, 

OS-RBGEN-3-1, 

OS-RGT-5-1, 

OS-VGT-2-2, 

OS-VGT-5-1, 

OS-GC-2-1, 

OS-GYLD-1-1, 

SMS021-80, 

ZM-CPC-5-1, 

ZM-ED-5-1, 

ZM-IVDOM-5-1, 

ZM-IVDOM-5-2, 

ZM-IVDOM-5-3, 

ZM-MOIST-5-2, 

ZM-MOIST-5-2, 

ZM-MOIST-5-3, 

ZM-B10M-5-1, 

ZM-DMC-6-2, 

ZM-DMY-5-1, 

ZM-GYLD-5-1 

ZM-GYLD-5-3, 

ZM-GYLD-5-3, 

ZM-GYLD-64, 

ZM-GYLD-64, 

ZM-KW300-5-1, 

ZM-TW-5-1, 

ZM-YLD-6-1 


207 


221 


CGC 


301 


- 


- 


- 




X 




X 


- 


- 


OS-VGT-2-2, 
OS-GC-2-1 








343 








- 


X 


- 








OS-FLLEN-3-1, 

OS-GPL-2-1, 

OS-GYLD-2-1, 

ZM-ID-5-2, 

ZM-M0IST4-3, 

ZM-M01ST-54, 

ZM-PC-5-1, 

ZM-STC-5-1, 

ZM-DMC-5-1, 

ZM-DMY4-3, 

ZM-GYLD5-2 








287 










X 






X 
















191 


X 








X 


















215 






X 




X 








— 




373 
972 


387 
986 


TCG 
CCG 


23 










X 


X 








ZM-M01ST-2-3, 

2M-STC-2-2, 

ZM-DMY-2-3, 

ZM-DMY-2-4, 

ZM-GYLD-2-3 








147 


X 








X 


















345 










X 


















347 


X 


- 


- 


- 


X 




- 


- 


- 


OS-GPDF-1-1, 

SMS015-16, 

ZM-CL-9-1, 

ZM-CPC-3-1, 

ZM-CPC-3-3, 

ZM-CPC-8-1, 

ZM-ID-8-1, 

ZM-ID-8-1, 

ZM-ID-8-1, 

ZM-IVDOM-3-1, 

ZM-IVDOM-3-3, 

ZM-M01ST-8-1, 

ZM-MOIST-8-2, 

ZM-MOIST-9-2, 

ZM-PC-8-1, 

ZM-PC-9-1, 

ZM-PR-9-1, 

ZM-SIC-8-1, 

ZM-BIOM-8-1, 

ZM-DMC-8-1, 

ZM-DMC-8-2, 

2M-DMY-3-2, 

ZM-DMY-3-3, 

ZM-DMY-8-1, 

ZM-DMY-8-2, 

ZM-GWE-9-1, 

ZM-GWM2-3-1, 

ZM-GYHA-8-1, 
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ZM-GYLD-8-2, 

ZM-GYLD-9-1, 

ZM-H-3-1, 

ZM-M-8-1, 

ZM-KW 100-9-1 

ZM-KW300-3-2, 

ZM-KW30O-8-2, 

ZM-KW300-9-2, 

ZM-TGW-9-1, 

ZM-TW-8-1, 

ZM-YLD-9-1, 

ZM-YLD-9-1 








157 


X 


- 


- 


- 


X 


- 


- 


- 


- 


MAS24-2, 

ZM-CPC-1-4, 

ZM-CPC-l-6, 

ZM-MOIST-4-3, 

ZM-M01ST-7-3, 

ZM-MOlST-7-4, 

ZM-MOIST-9-2, 

ZM-MOIST-9-2, 

ZM-PC-9-1, 

ZM-BIOM-3-1, 

ZM-DMC-1-2, 

ZM-DMY-1-3, 

ZM-DMY-1-5, 

ZM-DMY-4-3, 

ZM-GWM2-3-2, 

ZM-GYLD-3-3, 

TM-GYT D-9-t 

ZM-GYUI-9-1 

ZM-GYUI-9-2, 

ZM-GYUP-9-2, 

ZM-KW1 00-9-1, 

ZM-KW30O-9-1, 

ZM-KW300-9-2, 

ZM-YLD-9-1 


126 


140 


CCT 


349 










X 


















139 


X 








X 


















175 


X 








X 


















5 








X 


X 


















351 










X 


















353 


X 








X 
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309 










X 




X 






OS-RGT-2-1, 
OS-VOT-2-1 


378 


392 


CAA 


355 










X 


















255 


- 


- 


- 


- 


X 


- 




- 


X 


OS-GW-9-1, 

MAS13-24, 

MAS13-31, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-CPC-7-2, 

ZM-CPC-7-3, 

ZM-IVDOM-1-2, 

ZM-IVDOM-1-4, 

ZM-MOIST-M, 

ZM-MOIST-1-5, 

ZM-MOIST-7-1, 

ZM-MOIST-7-2, 

ZM-PC-1-1, 

ZM-STC-7-2, 

ZM-BIOM-7-1, 

7U r^KAC 1 1 
£JV1-Ulvl\^- 1- 1, 

7M-DMY- 1 -4 

ZM-GWM2-7-1 

ZM-GYLD-7-3, 

ZM-GYUP-1-2, 

ZM-HI-7-1, 

ZM-KW300-1-2, 

ZM-TW-1-1 








75 


X 








X 


















357 










X 


















359 










X 










OS-GW-5-1, 

OS-YDD-5-1, 

ZM-MOlST-4-3, 

ZM-DMY-4-3, 

ZM-YLD-4-1 








361 










X 
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363 










X 










OS-GW-3-1, 

ZM-CPC-1-2, 

ZM-IVDOM-1-1, 

ZM-IVDOM-9-1, 

ZM-IVDOM-9-2, 

ZM-MOIST-1-2, 

ZM-MOIST-1-2, 

ZM-MOIST-9-3, 

ZM-DMY-9-1, 

ZM-GYHA-1-3, 

ZM-GYHA-1-4, 

ZM-GYLD-1-1, 

ZM-GYLD-9-2, 

ZM-GYLD-9-2, 

ZJV1-VJ I Ur- 1- J , 

7M rrvi TP 1 1 

C-AVl 1 11 11, 

7M-KW100-U2 
ZM-KW 100-9-1 
ZM-TGW-9-2, 
ZM-TW-9-1, 
ZM-YLD-1-1 








365 










X 


















181 


X 








X 


















367 




- 




- 


X 


- 


- 


- 












261 










X 








X 










221 






x 




x 


















57 




X 






X 


















25 










X 


X 










1047 


1061 


CGC 


369 










X 










OS-CHALK- 10- 
1. 

ZM-MOIST-2-3, 

ZM-DMY-2-3, 

ZM-GYLD-2-3 








39 




X 






X 
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87 


X 








X 










OS-APDF-9-1, 

MAS13-24, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-IVDOM-l-2, 

ZM-IVDOM-1-4, 

ZM-MOIST-1-4, 

ZM-MOIST-1-5, 

ZM-MOIST-2-3, 

ZM-PC-1-1, 

ZM-STC-2-2, 

ZM-DMC-1-1, 

ZM-DMY-1-4, 

ZM-DMY-2-3, 

7M.nMY.?-4 

Z_J VI l^i VI I -^-*tj 

ZM-GYID-2-1 

ZM-GYLD-2-3, 

ZM-GYUP-1-2, 

ZM-KW300-1-2, 

ZM-TW-1-1, 

ZM-YID-2-1, 

ZM-YLD-2-2 


30 
1391 


44 
1411 


CCT 
CCG 


371 










X 


















163 


X 








X 


















373 










X 


















313 










X 




X 






OS-GW-5-1, 
OS-YID-5-1 








375 










X 


















315 


X 


- 


- 




- 


X 


- 


X 


- 


- 


OS-GPL-4-1,
OS-GPP-4-1,
OS-GYLD-4-1,
MAS24-2,
ZM-CPC-3-2,
ZM-ID-10-1,
ZM-ID-2-1,
ZM-MOIST-10-1,
ZM-MOIST-2-2,
ZM-MOIST-3-2,
ZM-MOIST-9-2,
ZM-PC-9-1,
ZM-STC-10-1,
ZM-BIOM-3-1,


683 


703 


CCG 
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ZM-DMC-10-1, 

ZM-DMC-10-2, 

ZM-DMC-2-3, 

ZM-DMY-10-1, 

ZM-DMY-3-1, 

ZM-EWT-2-1, 

ZM-GWM2-10- 

1, 

ZM-GWM2-3-2, 

ZM-GYHA-3-1, 

ZM-GYLD-2-2, 

ZM-GYLD-3-3, 

ZM-GYUI-9-1, 

7M-GYUI-9-2 

ZM-GYUP-9-2 

ZM-HI-10-1 

* ■* ~ *■ M MM MV Ay 

ZM-KW300-3-3, 

ZM-KW300-9-1, 

ZM-KW300-9-2, 

ZM-TW-10-2, 

ZM-TW-2-3 








89 


X 








X 


















377 










X 


















289 










X 






X 












49 




X 






X 








- 










153 


X 


X 






X 


















81 


X 








X 


















379 










X 


- 










707 
882 


721 
902 


CGC 
GGA 


305 










X 




X 






OS-BDV-1-1, 

OS-CHALK-l-l, 

OS-CPV-1-1, 

OS-CSV- 1-1, 

OS-SBV-1-1, 

OS-GP-1-1, 

OS-GW-1-2, 

OS-YLD-1-1, 

ZM-MOIST-1-1, 

ZM-M01ST-1-2, 
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ZM-GYHA-1-2, 

ZM-GYHA-1-3, 

ZM-GYUP-1-1, 

ZM-H-1-1, 

ZM-KW100-1-2 








381 










X 










OS-GPL-4-1,
OS-GPP-4-1,
OS-GYLD-4-1,
MAS24-2,
ZM-CPC-3-2,
ZM-ID-10-1,
ZM-ID-2-1,
ZM-MOIST-10-1,
ZM-MOIST-2-2,
ZM-MOIST-3-2,
ZM-MOIST-9-2,
ZM-PC-9-1,
ZM-STC-10-1,
ZM-BIOM-3-1,
ZM-DMC-10-1,
ZM-DMC-10-2,
ZM-DMC-2-3,
ZM-DMY-10-1,
ZM-DMY-3-1,
ZM-EWT-2-1,
ZM-GWM2-10-1,

ZM-GWM2-3-2, 

ZM-GYHA-3-1, 

ZM-GYLD-2-2, 

ZM-GYLD-3-3, 

ZM-GYUI-9-1, 

ZM-GYUI-9-2, 

ZM-GYUP-9-2, 

ZM-HI-10-1, 

ZM-KW300-3-3, 

ZM-KW300-9-1, 

ZM-KW300-9-2, 

ZM-TW-10-2, 

ZM-TW-2-3 








197 


X 








X 


















45 




X 






X 
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97 


X 








X 






_ 


_ 










383 










X 






_ 


_ 










135 


X 








X 








— 










267 


X 








X 






_ 


X 




217 


234 


CCG 


385 










X 












90 
575 


107 
592 


CGG 
CCG 


33 




X 






X 


















283 










X 






X 






391 


408 


CGG 


53 




X 






X 


















253 










X 








X 










387 










X 


















295- 






- 


- 


X 


- 


- 


X 




OS-GPL4-1, 

OS-GPP-4-1, 

OS-GYLIM-1, 

MAS24-2, 

ZM-CPC-3-2, 

ZM-ID-10-1, 

ZM-ID-2-1, 

ZM-MOIST-10- 

1, 

2M-M01ST-2-2, 

ZM-MOlST-3-2, 

2M-MOIST-9-2, 

ZM-PC-9-1, 

ZM-STC-10-1, 

ZM-BIOM-3-1, 

2M-DMC-10-1, 

ZM-DMC-10-2, 

ZM-DMC-2-3, 

ZM-DMY-10-1, 

ZM-ESVIY-3-1, 

ZM-EWT-2-1, 

ZM-GWM2-10- 

1 

ZM-GWM2-3-2, 

ZM-GYHA-3-1, 

ZM-GY1D-2-2, 

ZM-GYLD-3-3, 

ZM-GYU-9-1, 
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TKM C*\/l TT fi O 

zivi-UYUi-y-z, 

7\a rrvr TP 0 0 

7A4-T4T 10-1 

ZM-KW300-9-1, 
ZM-KW30O-9-2, 
ZM-TW-10-2, 
ZM-TW-2-3 








389 










X 


















225 






X 




X 


















391 










X 


















167 


X 


- 




- 


X 


- 


- 


- 


- 


OS-GW-3-1, 

MAS19-14, 

SMS021-79, 

ZM-CL-9-1, 

ZM-CPC-1-2, 

2M-CPC-6-2, 

ZM-ID-8-1, 

ZM-ID-8-1, 

ZM-IVDOM-1-1, 

ZM-IVDOM-1-3, 

ZM-IVDOM-9-1, 

ZM-IVDOM-9-2, 

ZM-MOIST-1-2, 

ZM-MOIST-1-3, 

ZM-MOIST-4-3, 

ZM-MOIST-9-3, 

ZM-PC-8-1, 

ZM-PC-9-1, 

ZM-PR-9-1, 

ZM-BIOM-8-1, 

ZM-DMC-6-1, 

ZM-DMC-8-1, 

ZM-DMY-1-2, 

7M-DMY-4-3 

ZM-DMY-8-2, 

ZM-DMY-9-1, 

ZM-GWE-9-1, 

2M-GYHA-1-1, 

ZM-GYHA-1-4, 

ZM-GYHA-8-1, 
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ZM-GYLD-1-1, 

ZM-GYLD-1-2, 

ZM-GYLD-6-1, 

ZM-GYLD-6-4, 

ZM-GYLD-9-2, 

ZM-GYLD-9-2, 

ZM-GYUP-1-1, 

ZM-HI-1-1, 

2M-HI-8-1 

ZM-KW 100-9-1 

ZM-KW300-8-2, 

2M-TGW-9-1, 

ZM-TGW-9-2, 

ZM-TW-9-1, 

ZM-YLD-1-1, 

ZM-YLD-9-1 








137 


X 


_ 


_ 


- 


X 


- 


_ 


_ 


_ 










393 


_ 




_ 


_ 


X 


_ 
















195 


X 








X 


















263 










X 








X 










41 




X 






X 


















303 










X 




X 














223 






X 




X 


















85 


X 








X 


















395 










X 






- 












129 


X 






- 


X 


- 








OS-ASS-6-2, 

MAS24-2, 

ZM-ID-5-2,. 

ZM-M01ST-5-4, 

ZM-MOIST-9-2, 

ZM-PC-5-1, 

ZM-PC-9-1, 

ZM-STC-5-1, 

2M-DMC-5-1, 

ZM-GYLD-5-2 

ZM-GYUl-9-1, 

ZM-GYUI-9-1, 

ZM-GYUI-9-2, 

ZM-GYUP-9-1, 

ZM-GYUP-9-2, 

ZM-KW30O-9-1, 

ZM-KW300-9-2 








103 | 


X 








X 
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51 


- 


X 


_ 




X 


_ 


_ 




_ 










99 


- 


- 


- 




X 




_ 


_ 


_ 










69 


X 


- 






X 


_ 


_ 




_ 










397 


_ 


_ 


_ 


_ 


X 








_ 










229 


_ 




X 


_ 


X 


_ 


_ 














399 


_ 


- 


_ 




X 


_ 


_ 




_ 










241 


_ 




X 




X 






_ 












91 


X 


_ 


_ 


_ 


X 




_ 




_ 










401 


_ 


_ 






X 


_ 


_ 


_ 


_ 










121 


X 






_ 


X 




_ 


_ 


_ 










403 


_ 


_ 


_ 




X 




_ 


_ 












187 


X 


m 


_ 


_ 


X 


















405 


_ 


_ 


_ 




X 


















13 




m 




X 


X 


















243 






X 


. 


X 


















203 


X 








X 












441 


455 


CGG 


407 










X 


















409 










X 


















411 










X 












243 


260 


CAG 


105 


X 








X 


















107 


X 








X 












235 


255 


GAG 


115 


X 








X 












1449 


1463 


CGG 


15 




- 


- 


X 


X 


- 


- 


- 


- 










165 


X 








X 


















123 


X 








X 


















205 


X 








X 


















63 




X 






X 


















413 










X 












146 


160 


CGG 


209 


X 


m 






X 






_ 


_ 










323 


m 


m 






X 




X 


_ 


_ 




129 

368 


143 

385 


CGG 
CCG 


77 


X 


_ 


_ 




X 


_ 


_ 




_ 










415 


_ 


_ 


_ 




X 


- 


_ 




_ 










141 


X 


- 


- 


- 


X 


- 


- 


- 


- 




128 


148 


CCT 


27 










X 


X 
















65 




X 






X 


















185 


X 








X 


















299 










X 






X 






5 


22 


CGG 


67 




X 






X 


















17 








X 


X 


















279 










X 








X 
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71 


X 


- 


- 


- 


X 


- 


- 


- 


- 










207 


X 


- 


- 


- 


X 


- 


- 


- 


- 




8 


25 


CCG 


417 


- 


- 


- 


- 


X 


- 


- 


- 


- 










127 


X 


- 


- 


- 


X 


- 


- 


- 


- 










125 


X 


- 


- 


- 


X 


- 


- 


- 


- 










117 


X 


- 


- 


- 


X 




- 


- 


- 










183 


X 


- 


- 


- 


X 


- 


- 


- 


- 










419 


- 


- 


- 


- 


X 


- 


- 


- 


- 










421 


- 


- 


- 


- 


X 


- 


- 


- 


- 










29 


- 


- 


- 


- 


X 


X 


- 


- 


- 










297 


- 


- 


- 


- 


X 


- 


- 


X 


- 










423 










X 












921 


936 


AG 


425 


- 


- 




- 


X 


- 


- 


- 


- 










245 






X 




X 


















427 


- 


- 


- 


- 


X 


- 


- 


- 


- 










429 


- 


- 




- 


X 


- 


- 


- 


- 










247 






X 




X 


















249 


- 


_ 


X 


_ 


X 


- 


- 


_ 


_ 










159/ 

171 

X 


- 




X 




_ 


_ 


_ 














31 




X 






X 


















275 










X 








X 




217 
753 


234 
767 


GGC 
CGG 


19 










X 


X 
















151 


X 








X 


















213/ 
227- 


X 




X 


- 






1 






OS-FLLEN-9-1, 

OS-GW100-4-1, 

MAS24-2, 

MAS24-3, 

ZM-CPC-1-4, 

ZM-CPC-1-6, 

ZM-CPC-10-1, 

ZM-IVDOM-10- 

1, 

ZM-IVDOM-10- 
2, 

ZM-MOIST-1-1, 
ZM-MOIST-9-2, 
ZM-PC-9-1, 


339 
434 


353 
448 


GTC 
AGC 
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ZM-STC-10-2, 

ZM-STC-2-2, 

ZM-DMC-1-2, 

ZM-DMY-l-3, 

ZM-DMY-1-5, 

ZM-DMY-2-4, 

ZM-GYHA-1-2, 

7M.OVI IT 0 1 

ZM-GYUI-9-2 

ZM-GYUP-9-1 

ZM-GYUP-9-2, 

ZM-H-1-1, 

2M-KW30O-9-1, 

ZM-KW30Q-9-2 








237 




- 


X 


- 


X 


- 




- 


- 










133 


X 








X 




X 














239 






X 




X 


















161 


X 








X 


















61 


X 








X 


















47 




X 






X 


















219 






X 




X 


















259/ 
271- 


— 


_ 


X 


_ 






X 














93 


X 








X 










OS-AE-12-1 








111 


X 








X 












275 


289 


GCG 


73 


X 








X 












54 


74 


CGG 


235 






X 




X 


















217 






X 




X 


















257 










X 








X 














201 



OS-AMY-6-1, 

OS-AMY-6-2, 

OS-ASS-6-1, 

OS-GC-6-1, 

OS-BDV-6-1, 

OS-CHALK-6-1, 

OS-CPV-6-1, 

OS-CPV-6-2, 

OS-CSV-6-1, 

OS-CSV-6-2, 

OS-HPV-6-1, 

OS-HPV-6-2, 

OS-SBV-6-1, 

OS-WC-6-1, 

OS-DM-6-1, 

OS-GP-6-1, 

OS-Y-6-1, 

MAS24-2, 

ZM-CPC-6-2,

ZM-ID-10-1, 

ZM-MOIST-10-1,

ZM-MOIST-9-2, 

ZM-MOIST-9-2, 

ZM-FC-9-1, 

ZM-STC-10-1, 

ZM-DMC-10-1, 

ZM-DMC-10-2, 

ZM-DMC-6-1, 

ZM-DMC-6-2, 

ZM-DMY-10-1, 

ZM-GWM2-10-1,

ZM-GYLD-6-1, 

ZM-GYU>6-4, 

ZM-GYLD-6-4, 

ZM-GYLD-9-1, 

ZM-GYUI-9-1, 

ZM-GYUI-9-2, 

ZM-GYUP-9-2, 

ZM-HI-10-1, 

ZM-KW100-9-1,

ZM-KW300-9-1, 

ZM-KW300-9-2, 

ZM-TW-10-2, 

ZM-YLD-6-1, 

ZM-YLD-9-1 
281










X 






X 












251 










X 








X 










3 








X 


X 










OS-AE-11-1, 
ZM-MOT^T-1 /> 

ZM-MOIST-S-1 

ZM-PC-1-2, 

ZM-GWM2-1-1, 

ZM-GYHA-5-1, 

ZM-GYLD-5-3, 

ZM-HI-1-2, 

ZM-KW100-1-2 


24 


38 


CGC 


21 


- 


- 


- 


- 


X 


X 


- 


- 


- 


OS-AE-12-1 








179 


X 








X 


















319 


X 








X 




X 








41 


55 


CCG 


7 








X 


X 


















291 










X 






X 






10 


24 


GAG 


169 


X 








X 


















83 


X 


- 






X 


















269 






- 




X 








X 










9 








X 


X 


- 








OS-GPL4-1,

OS-GPP-4-1,

OS-GYLD-4-1,

MAS24-2, 

MAS24-28, 

ZM-CPC-3-2, 

ZM-ID-10-1, 

ZM-ID-2-1, 

ZM-MOIST-10- 

1, 

ZM-MOIST-2-2, 

ZM-MOIST-3-2, 

ZM-MOIST-4-3, 

ZM-MOIST-5-3, 

ZM-MOIST-9-2, 

ZM-PC-9-1, 

ZM-STC-10-1, 

ZM-BIOM-3-1,

ZM-DMC-10-1, 

ZM-DMC-10-2, 

ZM-DMC-2-3, 
ZM-DMY-10-1,

ZM-DMY-3-1, 

ZM-DMY-4-3, 

ZM-EWT-2-1, 

ZM-GWM2-10- 

1, 

ZM-GWM2-3-2, 

ZM-GYHA-3-1, 

ZM-GYLD-2-2, 

ZM-GYLD-3-3, 

ZM-GYLD-5-2, 

ZM-GYUI-9-1, 

7\A fTVT TT Q 9 

VM.nVT IP-Q-9 
^jvi-vj i \jr y £y 

ZM-HI-10-1

ZM-KW300-3-3,

ZM-KW300-9-1,

ZM-KW300-9-2,

ZM-TW-10-2,

ZM-TW-2-3








449 










X 








X 










277 










X 








X 




664 


681 


ACT 


285 


- 


- 


- 


- 


X 


- 


- 


X 




OS-PGWC-8-1, 

OS-FLWID-3-1, 

OS-GPP-8-2, 

SMS015-9, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-IVDOM-1-2, 

ZM-IVDOM-1-3,

ZM-MOIST-1-3, 

ZM-MOIST-1-4, 

ZM-MOIST-4-2, 

ZM-MOIST-4-3, 

ZM-PC-1-1, 

ZM-DMC-1-1, 

ZM-DMY-1-2,

ZM-DMY-1-4, 

ZM-DMY-4-3, 

ZM-GYHA-1-1, 

ZM-GYLD-1-2, 

ZM-GYUP-1-2, 

ZM-TW-1-1
325










X 




X 






OS-PGWC-8-1, 

OS-FLWID-3-1,

OS-GPL8-2, 

OS-GPP-8-2, 

OS-GYLD-8-2, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-IVDOM-1-2, 

ZM-IVDOM-1-3, 

ZM-MOIST-1-3, 

ZM-MOIST-1-4, 

ZM-PC-1-1,

ZM-DMC-1-1, 

ZM-DMY-1-2, 

ZM-DMY-1-4, 

ZM-GYHA-1-1, 

ZM-GYLD-1-2, 

ZM-GYUP-1-2, 

ZM-TW-1-1 








265 


- 


- 


- 


- 


X 


- 


- 


- 


X 


OS-FLLEN-3-1, 

OS-GPL2-1, 

OS-GYLD-2-1, 

MAS24-21, 

ZM-ID-5-2, 

ZM-MOIST-4-3, 

ZM-MOIST-4-4,

ZM-MOIST-5-4,

ZM-PC-5-1,

ZM-STC-5-1,

ZM-DMC-5-1, 

ZM-DMY-4-2, 

ZM-DMY-4-3, 

ZM-DMY-4-4, 

ZM-EWT-4-2 

ZM-GYLEM-1, 

ZM-GYLD-5-2, 

ZM-HI-4-1, 

ZM-KNE-4-1, 

ZM-KW300-4-2,

ZM-KW&4-1, 

ZM-TGW-4-1


65 


79 


CGG 


327 










X 




x 1 















Oil






A. 




A 










TKA \ /fOTCT 1 1 

ZiVI-IYlvJlo I -^J, 

ZM-STC-2-2, 

ZM-DMY-2-3, 

ZM-DMY-2-4,

ZM-DMY-4-3, 

ZM-GYLD-2-3








37 




X 


- 


- 


X 




- 














43 




X 






X 










ZM-DMY-4-1 








293 


- 








X 


- 




X 




OS-CIF-6-1, 

MAS13-32, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-IVDOM-1-2, 

ZM-MOIST-1-4, 

ZM-MOIST-2-1, 

ZM-MOIST-9-2, 

ZM-PC-1-1, 

ZM-DMC-1-1, 

ZM-DMY-1-4, 

ZM-DMY-2-1, 

ZM-GYLD-2-4, 

ZM-GYLD-9-1, 

ZM-GYUP-1-2, 

ZM-KW100-9-1, 

ZM-KW300-9-2, 

ZM-TW-1-1, 

ZJVj-YLD-y-1 








11 1 


Y 
A 








A 




A 






ZM-CrU-o-Z, 

ZM-DMC-6-1, 

ZM-DMC-6-2, 

ZiVl-O I LLrO 1 , 

ZM-GYLD-6-4, 
ZM-YLD-6-1 


536 


550 




79 


X 








X 










OS-AMY-5-1 








211 






X 




X 










OS-APDF-9-1, 

OS-VGT-9-1, 

OS-GW-9-1 








177 


X 








X 










OS-CIF-6-1 


44
117 


58 
131 


CGT 
GGA 
Table 3 : Table 3 provides a further subset of rice genes whose expression is up-regulated
during grain filling.

Further identified are SSR sequences in the coding region of the rice genes.

A = structural protein

B = hypothetical/unknown proteins

C = growth/division and development

D = classification not clear

E = Cereal_Grain_Filling_QTLs (a description of the respective QTLs is provided in
Table . . . below)

F = beginning of the SSR

G = end of the SSR

H = nucleotide sequence of the trinucleotide repeat unit
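Columns F, G and H can be read as the output of a scan for trinucleotide simple sequence repeats (SSRs) in a coding sequence. The following is only an illustrative sketch, not the method used to build the table: the function name `find_trinucleotide_ssrs` is hypothetical, the coordinates are assumed to be 1-based and inclusive, and the minimum of five consecutive repeat units is an assumption suggested by table entries such as F = 24, G = 38 (a 15-nucleotide span).

```python
import re


def find_trinucleotide_ssrs(seq, min_repeats=5):
    """Return (start, end, unit) tuples for trinucleotide SSRs.

    start and end are 1-based, inclusive positions (assumed to correspond
    to columns F and G); unit is the repeated trinucleotide (column H).
    """
    # A 3-base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAA...; they are not
            # trinucleotide repeats in the usual sense.
            continue
        hits.append((match.start() + 1, match.end(), unit))
    return hits


# Made-up sequence for illustration: five CGG units flanked by other bases.
example = "ATA" + "CGG" * 5 + "TTA"
print(find_trinucleotide_ssrs(example))
```

Because the scan runs left to right, a repeat may be reported in a shifted frame (for example GCG instead of CGG) when the base preceding the repeat happens to match the end of the unit; this is a property of the sketch, not a statement about how the table was produced.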



SEQ ID


A 


B 


C 


D 


E 


F 


G 


H 


329 




X 














331 








X 










332 


X 
















333 




X 














334 




X 














335 




X 














343 








X 










23 




X 














345 




X 














351 




X 














355 




X 














357 




X 














361 




X 














363 




X 














365 








X 










369 








X 










371 






X 












373 




X 














313 




X 














375 




X 














377 




X 














379 




X 














381 








X 
383


- 


X 


- 


- 










387 


- 


X 


- 


- 










389 




X 














393 




X 














395 




X 














99 








X 










397 




X 














229 




X 


- 












403/431- 


- 










16 


39 


CCG 


433 




X 






OS-AMY-5-1, 

MAS13-31, 

SMS021-80, 

ZM-CPC-5-1,

ZM-CPC-7-2, 

ZM-IVDOM-5-1, 

ZM-IVDOM-5-2, 

ZM-MOIST-5-2, 

ZM-MOIST-5-2, 

ZM-MOIST-5-3, 

ZM-MOIST-7-1, 

ZM-BIOM-5-1,

ZM-BIOM-7-1,

ZM-DMY-5-1,

ZM-GWM2-7-1 

ZM-GYLD-5-1, 

ZM-GYLD-5-3, 

ZM-M-7-1, 

ZM-KW300-5-1, 

ZM-TW-5-1 








435 








X 










437 




X 














439 




X 






OS-YLD-3-2, 
ZM-ID-5-1,
ZM-IVDOM-5-3,
ZM-GYLD-5-3 












441 








X 


OS-REGEN-5-1, 

MASJ2-18, 

MAS24-16, 

SMS015-16, 

SMS021-81, 

ZM-ID-6-1, 

ZM-ID-6-1, 

ZM-ID-8-1, 

ZM-ID8-1, 

ZM-ED-8-1, 

ZM-MOIST-5-1, 

ZM-MOIST-6-2, 

ZM-PC-8-1, 

ZM-STC-6-1, 

ZM-STC-8-1, 

ZM-VT-6-1, 

ZM-BIOM-8-1, 

ZM-DMC-8-1, 

ZM-DMY-8-1, 

ZM-DMY-8-2, 

ZM-GYHA-5-1, 

ZM-GYHA-6-1, 

ZM-GYLD-5-3, 

ZM-GYLD-6-2,

ZM-GYLD-6-3,

ZM-HI-8-1,

ZM-KW300-6-2


1912 


1929 


CGG 


443 










OS-RGT-12-2, 
OS-GWPL-12-1 


117 
1962 


131 
1979 


CGG 
CGG 


445 




X 














447 




X 






OS-YLD-3-2 
95








X 


OS-CIF-8-1, 

OS-GW-8-1, 

MAS13-24, 

ZM-CPC-1-3, 

ZM-CPC-1-5, 

ZM-IVDOM-1-2, 

ZM-IVDOM-1-4, 

ZM-MOIST-1-4, 

ZM-MOIST-1-5, 

ZM-PC-1-1, 

ZM-DMC-1-1, 

ZM-DMC-6-2, 

ZM-DMY-1-4, 

ZM-GYLD-6-4, 

ZM-GYUP-1-2, 

ZM-KW300-1-2 

ZM-TW-1-1, 

ZM-YLD-6-1 








451 




X 






OS-PGWC-12-1, 

OS-BDV-12-1, 

OS-PKV-12-1 


962 


976 


GCA 


453 


- 


X 








27 
344 


47 
358 


CCT 
GCG 


455 




X 


- 


- 


MAS24-28, 

ZM-ID-10-1, 

ZM-ID-2-1,

ZM-MOIST-10-1, 

ZM-MOIST-2-2, 

ZM-MOIST-4-3,

ZM-MOIST-5-3, 

ZM-STC-10-1, 

ZM-DMC-10-1, 

ZM-DMC-10-2, 

ZM-DMC-2-3, 

ZM-DMY-10-1,

ZM-DMY-4-3,

ZM-EWT-2-1, 

ZM-GWM2-10-1, 

ZM-GYLD-2-2, 

ZM-GYLD-5-2, 

ZM-M-10-1, 

ZM-TW-10-2, 

ZM-TW-2-3 









457




X 














459 




X 






OS-PGWC-12-1, 
OS-BDV-12-1, 

OS-PKV-12-1


53 


73 


CGG 


461 




X 






OS-GW-11-1, 

ZM-IVDOM-9-1,

ZM-IVDOM-9-2, 

ZM-GYLD-9-2, 

ZM-KW100-9-1,

ZM-TGW-9-2 



























Table 4 : Genes involved in rice grain filling, which belong to the functional category of stress 
response proteins 



Rice 
(SEQ 
ID NO) 


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


1 




1065 


1182 


Similar to MPV17_HUMAN P39210 HOMO
SAPIENS (HUMAN). MPV17 PROTEIN.


3 






1115 


Similar to ANRX_ANASP Q44141 ANABAENA SP. 
(STRAIN PCC 7120). ANAREDOXIN. 


5 


939 


1030 


1184 


Similar to gi|20286|emb|CAA46916.1| peroxidase
[Oryza sativa] 


7 


935 


1037 




Similar to gi|1620753|gb|AAB17095.1| proteinase 
inhibitor [Oryza sativa] 


9 


934 


1011 


1110 


Similar to gi|3287683|gb|AAC25511.1| Similar to
apoptosis protein MA-3 gb|D50465 from Mus 
musculus. [Arabidopsis thaliana] 


11 




952 


1198 


Similar to gi|5725430|emb|CAB52439.1| stress 
responsive protein homolog [Arabidopsis thaliana] 


13 




998 


1175 




15 




1015 


1167 




17 


899 


1042 


1161 








Table 5 : Genes involved in rice grain filling, which belong to the functional category of signaling
molecules



Rice
(SEQ
ID NO)


Banana
(SEQ ID
NO)


Wheat
(SEQ ID
NO)


Maize
(SEQ ID
NO)


Gene Description




19 


- 


1089 


- 


Similar to gi|1352683|sp|P49599|P2C3_ARATH 
PROTEIN PHOSPHATASE 2C PPH1 (PP2C) 


21 


- 


971 


- 


Similar to gi|7269803|emb|CAB79663.1|
serine/threonine-specific kinase like protein [Arabidopsis 
thaliana] 


23 








Similar to gi|6520139|dbj|BAA87936.1| ZW9 
[Arabidopsis thaliana] 


25 




1071 


1120 


Similar to gi|9293975|dbj|BAB01878.1| receptor 
protein kinase [Arabidopsis thaliana] 


27 


916 


1049 






29 




984 


1186 





Table 6 : Genes involved in rice grain filling, which belong to the functional category of 
transmembrane proteins 



Rice 
(SEQ 
ID NO) 


Banana 
(SEQ 
ID NO) 


Wheat 
(SEQ 
ID NO) 


Maize
(SEQ 
ID NO) 


Gene Description 


31 




1025 




(nitrite transporter) 


33 




1047 




(amino acid selective channel protein)


35 


950 


959 


1164 


(G6P transporter in plastids) 


37 








(PTR2 POT family) 


39 


949 


1017 




(Leucine rich protein) 


41 


927 


962 


1112 


(immunoglobulin) 


43 


917 


982 


1109 


(dehydrogenase) 


45 




954 


1117 


(putative transport protein) 


47 


921 


1099 


1152 


(phosphate transporter) 


49 


891 


1040 


1128 


(monosaccharide (hexose) transporter)


51 




994 




(PTR2 POT family) 


53 




1067 


1159 


(cation transporter protein Ec) 


55 




1047 




(amino acid selective channel protein)


57 








(sugar transporter) 


59




1077 




(transporter protein)


61 




1085 




Similarity[ab043024_34-1656 /codon_start=1
/db_xref="gi:8051712" /product="sodium sulfate or
dicarboxylate transporter" /protein_id="baa96091.1" ]
Evidence[100% (1510/1510)]


63 




1105 




Similar to gi|7523692|gb|AAF63131.1|AC011001_1
Putative chloroplast inner envelope protein [Arabidopsis
thaliana]


65 




957 


1114 


Similar to PITH_STRHA P41132 STREPTOMYCES
HALSTEDII. PUTATIVE LOW-AFFINITY
INORGANIC PHOSPHATE TRANSPORTER
(FRAGMENT)


67 


944 


1075 




Similar to PTR2_YEAST P32901
SACCHAROMYCES CEREVISIAE (BAKER'S
YEAST). PEPTIDE TRANSPORTER PTR2
(PEPTIDE PERMEASE PTR2).



Table 7 : Genes involved in rice grain filling, which belong to the functional category of carbohydrate 
metabolism 



STARCH METABOLISM 


Branching Enzyme


Rice 
(SEQ 
ID NO) 


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


69 


888 


1058 




Similar to GLGB_ORYSA Q01401 ORYZA
SATIVA (RICE). 1,4-ALPHA-GLUCAN
BRANCHING ENZYME (EC 2.4.1.18) (STARCH
BRANCHING ENZYME) (Q-ENZYME).


71 




1026 


1157 


Similar to gi|4584507|emb|CAB40745.1| starch 
branching enzyme II [Solanum tuberosum]


73 




1018 


1157 


gi|3851526|gb|AAC72335.1| starch branching enzyme
IIa [Hordeum vulgare]


Debranching Enzyme


75 




987 




gi|17S3306|dbj|BAA09167.1| starch debranching
enzyme precursor [Oryza sativa]


77 




966 




Similar to gi|3252794|dbj|BAA29041.1| isoamylase
[Oryza sativa] 


Starch degradation 


Alpha-Amylases


79 


909 


1083 


1173 


Similar to AMYM_BACST P19531 BACILLUS
STEAROTHERMOPHILUS. MALTOGENIC
ALPHA-AMYLASE PRECURSOR (EC 3.2.1.133)
(GLUCAN 1,4-ALPHA-MALTOHYDROLASE)


81 


887 


1035 


1150 


Similar to gi|426482| Alpha-amylase 


83 


887 


1033 


1150 


|CAA39777.1| Alpha-amylase


85 




1033 


1150 


|CAA39777.1| Alpha-amylase


87 


887 


1033 


1151 


|PF00128| Alpha-amylase 






89; 
509 


887 


1032 


1150 


gi|426482|aaa50161.1| Alpha-amylase


91 




1034 


1150 


gi|113766|sp|P17654|AMY1_ORYSA ALPHA-
AMYLASE PRECURSOR (1,4-ALPHA-D-
GLUCAN GLUCANOHYDROLASE) (ISOZYME
1B)


alpha-Amylase Inhibitor 


01 










95 








Motifs{Cereal_Tryp_Amyl_Inh Cereal trypsin/alpha-amylase
inhibitors family; Pfam6_1|PF00234|tryp_alpha_amyl Protease
inhibitor/seed storage family} Evidence[100% (474/474)]


97 








Motifs{Aldehyde_Dehydr_Cys Aldehyde
dehydrogenases active sites; Cereal_Tryp_Amyl_Inh
Cereal trypsin/alpha-amylase inhibitors family}
Evidence[99% (486/489)]


99 








Motifs{Cereal_Tryp_Amyl_Inh Cereal trypsin/alpha-amylase
inhibitors family; Pfam6_1|PF00234|tryp_alpha_amyl Protease
inhibitor/seed storage family} Evidence[100% (501/501)]


Beta-Amylase 


101 




965 


1107 


Similarity[y16242_1-1798 /codon_start=2
/db_xref="gi:4138596" /partial=true /product="beta-amylase"
/protein_id="caa76131.1" ] Evidence[100%
(931/931)].


103 


926 


956 


1156 


Similarity[z25871_48-1514 /codon_start=1
/db_xref="swiss-prot:p55005" /ec_number="3.2.1.2"
/product="beta-amylase" /protein_id="caa81091.1" ]
Evidence[100% (1539/1539)]


105 




955 




gi|1703302|sp|P55005|AMYB_MAIZE BETA-AMYLASE
(1,4-ALPHA-D-GLUCAN
MALTOHYDROLASE)


107 




965 


1106 


gi|3334120|sp|P93594|AMYB_WHEAT BETA-AMYLASE
(1,4-ALPHA-D-GLUCAN
MALTOHYDROLASE)


Pullulanase 


109 




987 




Similarity[ab012915_2206-14924 /codon_start=1
/db_xref="gi:3172048" /product="starch debranching
enzyme" /protein_id="baa28632.1" /note="pullulanase"
] Evidence[100% (3079/3079)]








887 1032 1150 


Glucosidase


111




1005 




Similar to AMYG_NEUCR P14804 NEUROSPORA
CRASSA. GLUCOAMYLASE PRECURSOR (EC
3.2.1.3) (GLUCAN 1,4-ALPHA-GLUCOSIDASE)
(1,4-ALPHA-D-GLUCAN GLUCOHYDROLASE).


113 


905 


1021 




|CAA04707.1| Alpha-glucosidase 


115 




1086


1144 


gi|3023275|sp|Q43763|AGLU_HORVU ALPHA-
GLUCOSIDASE PRECURSOR (MALTASE)


117 








gi|544151|sp|Q99040|DEXB_STRMU GLUCAN
1,6-ALPHA-GLUCOSIDASE (DEXTRAN
GLUCOSIDASE) (EXO-1,6-ALPHA-
GLUCOSIDASE) (GLUCODEXTRANASE)


Sucrose Synthase


119 


932 


1006 


1148 


Similar to SUS2_ARATH Q00917 ARABIDOPSIS
THALIANA (MOUSE-EAR CRESS). SUCROSE
SYNTHASE (EC 2.4.1.13) (SUCROSE-UDP
GLUCOSYLTRANSFERASE).


121 


930 


1022 


1170 


gi|283009|pir||S22535 sucrose synthase (EC 2.4.1.13)
1 - rice (fragment)


123 


930 


1028 


1170 


gi|20366|emb|CAA46017.1| sucrose synthase [Oryza
sativa]


125 


930 


1054 


1170 


gi|267055|sp|Q00917|SUS2_ARATH SUCROSE
SYNTHASE (SUCROSE-UDP 
GLUCOSYLTRANSFERASE) 


127 


930 


1054 


1191 


gi|66572|pir||YUMU sucrose synthase (EC 2.4.1.13) -
Arabidopsis thaliana


Starch Synthase 


129 




1066 




Similar to UGS3_SOLTU Q43847 SOLANUM
TUBEROSUM (POTATO). GLYCOGEN
(STARCH) SYNTHASE PRECURSOR (EC
2.4.1.11) (GBSSII) (GRANULE-BOUND STARCH
SYNTHASE II) (FRAGMENT)


131 


924 


1070 


1125 


Similar to gi|3057122|gb|AAC14015.1| starch synthase
DULL1 [Zea mays]


133 


947 


1055 


1155 


Similar to gi|5257102|gb|AAD41242.1| granule bound
starch synthase [Oryza sativa subsp. japonica] 


ADPG 


pyrophosphorylase 


135 




989 


1193 


Similar to gi|3093462|gb|AAC15247.1| ADP-glucose
pyrophosphorylase large subunit [Oryza sativa] 


137 


922 


1098 




Similarity[ay028315_115-1617 /codon_start=1
/db_xref="gi:13508485" /product="adp-glucose
pyrophosphorylase small subunit"
/protein_id="aak27313.1" /note="putative amyloplast
form" ] Evidence[100% (1520/1520)]


139 




989 


1193


Similarity[ac007858_66917-70303 /codon_start=1
/db_xref="gi:5091608" /evidence="not_experimental"
/gene="10a19i.12" /protein_id="aad39597.1"
/note="identical to gb|d50317 adp glucose
pyrophosphorylase large subunit from oryza sativa.
ests dbj|d22125 and dbj|d15718 come from" ]
Evidence[100% (1615/1615)] Gene[10A19I.12
Identical to gb|D50317 ADP glucose
pyrophosphorylase large subunit from Oryza sativa.
ESTs dbj|D22125 and dbj|D15718 come from]


141 


922 


1098 


1193 


Similar to gi|169759|gb|AAA33890.1| ADP-glucose
pyrophosphorylase 51 kD subunit (EC 2.7.7.27)


Triosephosphate Isomerase 


143 


912 


1046 


1133 


Similarity[z32521_64-960 /codon_start=1
/db_xref="swiss-prot:p46225" /ec_number="5.3.1.1"
/product="triosephosphate isomerase"
/protein_id="caa83533.1" ] Evidence[100%
(822/822)]


145 


912 


1046 


1133 


/db_xref="swiss-prot:p46225" /ec_number="5.3.1.1"
/product="triosephosphate isomerase"
/protein_id="caa83533.1" ] Evidence[100%
(822/822)]


147 


890 


1003 


1134 


Similarity[j04121_>762 /codon_start=1
/db_xref="gi:556171" /product="triosephosphate
isomerase" /protein_id="aab62730.1" ]
Evidence[100% (683/683)]


Other proteins involved in starch metabolism


149 


936 


1043 


1194 


Similarity[x53130_51-1127 /codon_start=1
/db_xref="swiss-prot:p17784"
/protein_id="caa37290.1" /note="fructose-diphosphate
aldolase (aa 1-358)" ] Evidence[100% (1078/1078)]


151 




963 


1124 


|AAA45939.1| Alpha-1,4-glucan phosphorylase H
isozyme


153 


950 


959 


1164 


Similarity[af020813_273-1436 /codon_start=1
/db_xref="gi:2997589" /function="mediates the antiport
of glucose-6-phosphate against phosphate in plastids of
heterotrophic tissues" /gene="gpt" /product="glucose-
6-phosphate/phosphate-translocator precursor"
/protein_id="aac08524.1"


155;
507


913




1154


gi|4539316|emb|CAB38817.1| putative fructose-
bisphosphate aldolase [Arabidopsis thaliana]


157 




1069 




Motifs{Pfam6_l |PF00702|Hydrolase haloacid 
dehalogenase-like hydrolase} Evidence[82% 

yl \JJ£J l Z-j 1 *)} \ 


159 








Similarity[u17225_40-1743 /codon_start=1
/db_xref="gi:596023" /ec_number="5.3.1.9"
/gene="phi1" /product="glucose-6 phosphate
isomerase" /protein_id="aaa82734.1"
/note="phosphohexose isomerase" ] Evidence[100%
(1889/1889)] Gene[phi1 5.3.1.9 glucose-6 phosphate
isomerase phosphohexose isomerase]


161 


946 


1103 


1189 


Similarity[ab013353_89-1504 /codon_start=1
/db_xref="gi:3107931" /product="udp-glucose
pyrophosphorylase" /protein_id="baa25917.1" ]
Evidence[100% (1582/1582)]


163 


937 


970 


1153 


Similarity[af372833_47-1273 /codon_start=1
/db_xref="gi:13991929"
/product="phosphoenolpyruvate/phosphate
translocator" /protein_id="aak51561.1" /note="ppt" ]
Evidence[100% (1239/1239)]


165 


892 


964 


1179 


Similar to gi|5231119|gb|AAD41079.1|AF143202_1
starch phosphorylase L [Solanum tuberosum];
gi|130172|sp|P27598|PHSL_IPOBA ALPHA-1,4
GLUCAN PHOSPHORYLASE, L ISOZYME,
CHLOROPLAST PRECURSOR (STARCH
PHOSPHORYLASE L)


167 


902 


997 




Motifs{Pfam6_1|PF01591|6PF2K 6-phosphofructo-
2-kinase; Atp_Gtp_A ATP/GTP-binding site motif A
(P-loop)} Evidence[71% (2205/3069)]


169 


946 


1050 




Similarity[ap001383_68171-73040 /codon_start=1
/db_xref="gi:7242911" /protein_id="baa92509.1"
/note="similar to udp-glucose pyrophosphorylase.
(x91347)" ] Evidence[100% (1528/1528)]


171 




1023 




Similarity[u17225_40-1743 /codon_start=1
/db_xref="gi:596023" /ec_number="5.3.1.9"
/gene="phi1" /product="glucose-6 phosphate
isomerase" /protein_id="aaa82734.1"
/note="phosphohexose isomerase" ] Evidence[100%
(1889/1889)] Gene[phi1 5.3.1.9 glucose-6 phosphate
isomerase phosphohexose isomerase]






173 




975 




Similarity[d45218_54-1760 /codon_start=1
/db_xref="gi:639686" /product="phosphoglucose
isomerase (pgi-b)" /protein_id="baa08149.1" ]
Evidence[100% (1409/1409)]


175 


937 


970 


1153 


Similarity[af372833_47-1273 /codon_start=1
/db_xref="gi:13991929"
/product="phosphoenolpyruvate/phosphate
translocator" /protein_id="aak51561.1" /note="ppt" ]
Evidence[100% (1050/1050)]


177 


889 


1081 


1196 


Motifs{Pfam6_1|PF00274|glycolytic_enzy Fructose-
bisphosphate aldolase class-I; Aldolase_Class_I
Fructose-bisphosphate aldolase class-I active site}
Evidence[65% (1082/1650)]


179 


- 


977 


1180 


Similarity[z32850_352-4957 /codon_start=1
/db_xref="swiss-prot:q41141"
/product="pyrophosphate-dependent
phosphofructokinase beta subunit"
/protein_id="caa83683.1" ] Evidence[100%
(1698/1698)]


181 


892 


964 


1179 


Similarity[af095521_76-1923 /codon_start=1
/db_xref="gi:3790102" /ec_number="2.7.1.90"
/gene="ppi-pfka" /product="pyrophosphate-dependent
phosphofructokinase alpha subunit"
/protein_id="aac67587.1" ] Evidence[100%
(1939/1939)] Gene[PPi-PFKa 2.7.1.90
pyrophosphate-dependent phosphofructokinase alpha
subunit]


183 


906 


988 


1113


gi|3122594|sp|Q59126|PFP_AMYME
PYROPHOSPHATE--FRUCTOSE 6-PHOSPHATE
1-PHOSPHOTRANSFERASE (6-
PHOSPHOFRUCTOKINASE
(PYROPHOSPHATE)) (PYROPHOSPHATE-
DEPENDENT 6-PHOSPHOFRUCTOSE-1-
KINASE) (PPI-PFK)


185 


896 


1014 


1180 


gi|2499488|sp|Q41140|PFPA_RICCO
PYROPHOSPHATE--FRUCTOSE 6-PHOSPHATE
1-PHOSPHOTRANSFERASE ALPHA SUBUNIT
(PFP) (6-PHOSPHOFRUCTOKINASE
(PYROPHOSPHATE)) (PYROPHOSPHATE-
DEPENDENT 6-PHOSPHOFRUCTOSE-1-
KINASE) (PPI-PFK)






187; 
511 


911 




1138 


gi|3913641|sp|O64422|F16P_ORYSA FRUCTOSE-
1,6-BISPHOSPHATASE, CHLOROPLAST
PRECURSOR (D-FRUCTOSE-1,6-
BISPHOSPHATE 1-PHOSPHOHYDROLASE)
(FBPASE)


Non-Starch Carbohydrate 


Metabolism 


189 


912 


1046 


1133 


Similarity[z32521_64-960 /codon_start=1
/db_xref="swiss-prot:p46225" /ec_number="5.3.1.1"
/product="triosephosphate isomerase"
/protein_id="caa83533.1" ] Evidence[100%
(822/822)]


191; 
503 




1052 


1121


Similar to gi|9294516|dbj|BAB02778.1| contains
similarity to endo-1,3-1,4-beta-D-
glucanase~gene_id:MDB19.8 [Arabidopsis thaliana]


193 








Similar to PTSN_ECOLI P31222 ESCHERICHIA
COLI. NITROGEN REGULATORY IIA PROTEIN
(EC 2.7.1.69) (ENZYME IIA-NTR)
(PHOSPHOTRANSFERASE ENZYME II, A
COMPONENT); Motifs{Cytochrome_C Cytochrome
c family heme-binding site; Zinc_Finger_C2h2_1 Zinc
finger, C2H2 type, domain; Zinc_Finger_C2h2_1 Zinc
finger, C2H2 type, domain; Zinc_Finger_C2h2_1 Zinc
finger, C2H2 type, domain} Evidence[0% (0/2145)]


195 




1041 


1137 


Similar to gi|6714431|gb|AAF26119.1|AC012328_22
putative cellulose synthase catalytic subunit
[Arabidopsis thaliana]


197 








Similar to gi|22327|emb|CAA37998.1| corn Hageman
factor inhibitor [Zea mays] 


199 




1096 




ail728RS0knlP0RM0lAMYH YFAST 

till f £OOJ 1/I3L/II VSOlTTV/lxVIVl I 1 1 1 i-z/vcJ X 

GLUCOAMYLASE S1/S2 PRECURSOR 
(GLUCAN 


201 








Elements[GC_box@16653 TATA_box@16019
ATG@15968 PolyA@10370] Evidence[88%
(2550/2886)]


203 




1020 


1140 


Similar to gi|3850573|gb|AAC72113.1| Similar to
gi|1652733 glycogen operon protein GlgX from
Synechocystis sp. genome gb|D90908. ESTs
gb|H36690, gb|AA712462, gb|AA651230 and
gb|N95932 come from this gene. [Arabidopsis thaliana]


205 


904 


1095 


1130 


Similar to gi|5441877|dbj|BAA82375.1| Similar to 
glycogenin glucosyltransferase (EC 2.4.1.186).
(Z97341) [Oryza sativa] 


207 


895 


1076 


1181 


Similar to gi|8777412|dbj|BAA97002.1| indole-3-
glycerol phosphate synthase [Arabidopsis thaliana]


209 




1101 




gi|114156|sp|P13526|ARLC_MAIZE
ANTHOCYANIN REGULATORY LC PROTEIN 



Table 8 : Genes involved in rice grain filling, which belong to the functional category of storage 
proteins 



Rice 
(SEQ 
ID NO) 


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


211 








gi|121099|sp|P08079|GDB0_WHEAT GAMMA-
GLIADIN PRECURSOR 


213 


- 


1044 


1165 


Similar to GL19_ORYSA P29835 ORYZA SATIVA 
(RICE). 19 KD GLOBULIN PRECURSOR 
(ALPHA-GLOBULIN). 


215 








Similar to gi|224389|prf||1103218A glycinin A5
[Glycine max] 


217 








Similar to gi|296129|emb|CAA46197.1| prolamin 
[Oryza sativa] 


219 








Similar to gi|7209261|emb|CAB76962.1| alpha-gliadin
[Triticum aestivum]

Similar to gi|4126695|dbj|BAA36699.1| prolamin
[Oryza sativa] 


221








Similar to METC_RHILV Q52811 RHIZOBIUM
LEGUMINOSARUM (BIOVAR VICIAE). 
PUTATIVE CYSTATHIONINE BETA-LYASE (EC 
4.4.1.8) (CBL) (BETA-CYSTATHIONASE) 
(CYSTEINE LYASE) (ORF5) (FRAGMENT). 


223 




960 




Similar to GLU1_ORYSA P07728 ORYZA SATIVA
(RICE). GLUTELIN TYPE I PRECURSOR (CLONE 
PREE61). 


225 




1068 




Similar to gi|226227|prf||1502200A prolamin [Avena
sativa] 


227 




1044 


1165 


gi|232161|sp|P29835|GL19_ORYSA 19 KD 
GLOBULIN PRECURSOR 












229 




960 




Similar to gi|169969|gb|AAA33964.1| glycinin 


231 


948 


953 


1176 


Similar to PRVA_RANCA P18087 RANA
CATESBEIANA (BULL FROG). PARVALBUMIN 
ALPHA (PA 4.97). 






233 




yy 1 




gi|121101|sp|P08453|GDB2_WHEAT GAMMA-
GLIADIN PRECURSOR 


235 




you 




Similar to gi|20227|emb|CAA32566.1| preproglutelin
(AA -24 to 476) [Oryza sativa] 


237 




J V 1 J 




Similar to PRVT_CHICK P19753 GALLUS
GALLUS (CHICKEN). PARVALBUMIN, THYMIC
(AVIAN THYMIC HORMONE) (ATH) (THYMUS-
SPECIFIC ANTIGEN T1).


239 








Similar to gi|20208|emb|CAA38211.1| glutelin [Oryza
sativa] 


241 








Similar to gi|556407|gb|AAA50319.1| prolamin


243 








Similar to gi|166555|gb|AAA32715.1| avenin


245 




1048 




gi|1170517|sp|P45386|IGA4_HAEIN
IMMUNOGLOBULIN A1 PROTEASE
PRECURSOR


247 








gi|121090|sp|P04721|GDA1_WHEAT
ALPHA/BETA-GLIADIN A-I PRECURSOR 


249 








gi|121101|sp|P08453|GDB2_WHEAT GAMMA-
GLIADIN PRECURSOR 



Table 9 : Genes involved in rice grain filling, which belong to the functional category of Fatty Acid 
Metabolism 



Rice 
(SEQ 
ID NO)


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


251 


920 


976 


1131 


Similar to PHLB_SERLI P18954 SERRATIA
LIQUEFACIENS. PHLB PROTEIN PRECURSOR. 


253 




995 




Similar to LPXK_FRANO Q47909 FRANCISELLA
NOVICIDA. PROBABLE
TETRAACYLDISACCHARIDE 4'-KINASE (EC
2.7.1.130) (LIPID A 4'-KINASE).


255 




972 


1126 


Similar to gi|7339489|emb|CAB82812.1| 
phospholipase-like protein [Arabidopsis thaliana] 


257 




1087 


1177 


Similar to OLE2_ORYSA Q40646 ORYZA SATIVA 
(RICE). OLEOSIN 18 KD (OSE721). 
Similar to gi|1171354|gb|AAC02240.1| 18 kDa oleosin
[Oryza sativa] 


259 




1100 


1132 


Similar to gi|4455257|emb|CAB36756.1| oleosin,
18.5K [Arabidopsis thaliana] 






261 


910 


1093 


1158 


Similar to KSU5_ECOLI P42216 ESCHERICHIA
COLI. 3-DEOXY-MANNO-OCTULOSONATE
CYTIDYLYLTRANSFERASE (EC 2.7.7.38) (CMP-
KDO SYNTHETASE) (CMP-2-KETO-3-
DEOXYOCTULOSONIC ACID SYNTHETASE)


263 


884 


1038 


1172 


Similar to ACBP_GOSHI Q39779 GOSSYPIUM
HIRSUTUM (UPLAND COTTON). ACYL-COA-


265 


915 


990 


1122 


Similar to gi|4587543|gb|AAD25774.1|AC006577_10
Belongs to the PF|00657 Lipase/Acylhydrolase with
GDSL-motif family. EST gb|AB015099 comes from
this gene. [Arabidopsis thaliana]


267 


05*7 


lUoZ 




Similar to GBSB_BACSU P71017 BACILLUS
SUBTILIS. ALCOHOL DEHYDROGENASE (EC
1.1.1.1).


269 




961 




Similar to gi|6714447|gb|AAF26134.1|AC011620_10
putative phospholipase D [Arabidopsis thaliana] 


271 


- 


1100 


1132 


Similar to gi|1171352|gb|AAC02239.1| 16 kDa oleosin
[Oryza sativa] 

Similar to gi|944830|emb|CAA43183.1| soybean 24 
kDa oleosin isoform [Glycine max] 


273 


886 


1012 


1178 


Similar to gi|7576210|emb|CAB87871.1| palmitoyl-
protein thioesterase precursor-like [Arabidopsis
thaliana]


275 








Similar to 3O1D_COMTE Q06401 COMAMONAS
TESTOSTERONI (PSEUDOMONAS
TESTOSTERONI). 3-OXOSTEROID 1-
DEHYDROGENASE (EC 1.3.99.4)


277 




951 


1160 


Similar to CRTI_PHYBL P54982 PHYCOMYCES 
BLAKESLEEANUS. PHYTOENE 
DEHYDROGENASE (EC 1.3.-.-) (PHYTOENE 
DESATURASE). 


279 




973 




Similar to gi|6648208|gb|AAF21206.1|AC013483_30
putative phosphatidylinositol-4-phosphate 5-kinase 
[Arabidopsis thaliana] 
Table 10 : Genes involved in rice grain filling, which belong to the functional category of amino acid
metabolism 



Rice 
(SEQ 
ID NO) 


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


281 




1053 




Similar to gi|2076884|gb|AAB53975.1| lysine-
ketoglutarate reductase/saccharopine dehydrogenase 
[Arabidopsis thaliana] 


283 


- 


1036 


1199 


Similar to gi|974605|gb|AAA75104.1| single-stranded
nucleic acid binding protein 


285 




978 




68173.m01963#MAL21_29#AT3g20250#RNA-
binding protein, putative Length = 955


287 


y i o 


1008 


1139


gi|730108|sp|Q00539|NAM8_YEAST NAM8
PROTEIN 


289 


928 


1061 




Similar to gi|287298|dbj|BAA03504.1| aspartate 
aminotransferase [Oryza sativa] 


291 




70V/ 


1141


Similar to MTAP_HUMAN Q13126 HOMO
SAPIENS (HUMAN). 5'-METHYLTHIOADENOSINE
PHOSPHORYLASE (EC 2.4.2.28)
(MTA PHOSPHORYLASE) (MTAPASE).


293 








Similar to SEPR_THESP P80146 THERMUS SP. 
(STRAIN RT41A). EXTRACELLULAR SERINE 
PROTEINASE PRECURSOR (EC 3.4.21.-).


295 


903 


1019 




Similar to gi|6728985|gb|AAF26983.1|AC018363_28 
putative S-adenosylmethionine:2-demethylmenaquinone 
methyltransferase [A thaliana] 


297




1092 




68173.m01963#MAL21_29#AT3g20250#RNA-
binding protein, putative Length = 955


299 




986 


1169 


Similar to IF4H_HUMAN Q15056 HOMO 
SAPIENS (HUMAN). EUKARYOTIC 
TRANSLATION INITIATION FACTOR 4H (EIF- 
4H) (KIAA0038). 



Table 11 : Genes involved in rice grain filling, which belong to the functional category of transcription
factors 



Rice 
(SEQ 
ID NO) 


Banana 
(SEQ ID 
NO) 


Wheat 
(SEQ ID 
NO) 


Maize 
(SEQ ID 
NO) 


Gene Description 


301 








Similar to gi|7211973|gb|AAF40444.1|AC004809_2
Contains similarity to the CREB-binding protein (CBP)
from Mus sp gb|S66385. [Arabidopsis thaliana]


303 




974 




Similar to gi|6899934|emb|CAB71884.1| putative zinc-
finger protein [Arabidopsis thaliana]


305 








gi|2493550|sp|Q02516|HAP5_YEAST
TRANSCRIPTIONAL ACTIVATOR HAP5 


307


898 


1091 


1201 


Qimilar \r\ <r\\&XSXd 1 Rlohl A A A 1 R4 1 d 11 nUJFd 

oimiiar 10 gi| £ tvj £ r i o|gu|/\/\/\ 1 o*t m. i \ yjorn 


309 








protein Length = 320


311 


933 


996 


1129 


Myb family transcription factor 


313 








Myb family transcription factor 


315 


943 


1072 


1119 


Myb family transcription factor 


317 




1007 




Myb family transcription factor 


319 


- 


1013 


1143 


Similarity[af007269_37269-38693
/gene=a_ig002n01.20 /protein_id="aab61027.1"
/note="contains weak similarity to myb-related
proteins"] Evidence[100% (559/559)]


321 




1097 


1135 


Motifs{Myb_2 Myb DNA-binding domain repeat; 
Myb_2 Myb DNA-binding domain repeat} 
iividencepo/o (jvo/oW)} 


323 


940 


981 


1197 


Similar to gi|2894607|emb|CAA17141.1| NAM (no
apical meristem)-like protein [Arabidopsis thaliana]


325 






1171 


Similar to gi|2224929|gb|AAC49747.1| ethylene-insensitive3-like2
[Arabidopsis thaliana]


327 




979 


1174 


Myb DNA-binding domain repeat; Myb_2 Myb
DNA-binding domain repeat; Myb_2 Myb DNA-binding
domain repeat} Evidence[69% (615/879)]



Example 5: Rice Orthologs of Arabidopsis Grain Filling Genes Identified by Reverse
Genetics

Understanding the function of every gene is the major challenge in the age of completely
sequenced eukaryotic genomes. Sequence homology can be helpful in identifying possible functions
of many genes. Reverse genetics, the process of identifying the function of a gene by
obtaining and studying the phenotype of an individual containing a mutation in that gene, is another
approach to identifying the function of a gene.

Reverse genetics in Arabidopsis has been aided by the establishment of large publicly
available collections of insertion mutants (Krysan et al., (1999) Plant Cell 11, 2283-2290; Tissier et
al., (1999) Plant Cell 11, 1841-1852; Speulman et al., (1999) Plant Cell 11, 1853-1866; Parinov
et al., (1999) Plant Cell 11, 2263-2270; Parinov and Sundaresan, (2000) Biotechnology 11, 157-161).
Mutations in genes of interest are identified by PCR screening of large pools of individuals,
using primers derived from sequences near the insert border and from the gene of interest.
Pools producing PCR products are confirmed by Southern
hybridization and further deconvolved into subpools until the individual is identified (Sussman et al.,
(2000) Plant Physiology 124, 1465-1467).
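
The pool-and-deconvolve screen described above can be sketched as a recursive search. This is a minimal illustration only: the function names, the number of subpools per round, and the stand-in assay are assumptions made here, not details given in the cited protocols. The assay callback represents a PCR on pooled DNA (confirmed by Southern hybridization) that reports whether a pool contains at least one plant carrying the insertion.

```python
from typing import Callable, List, Sequence


def deconvolve(plants: Sequence[str],
               pool_is_positive: Callable[[Sequence[str]], bool],
               n_subpools: int = 4) -> List[str]:
    """Split positive pools into subpools until single plants are identified."""
    if n_subpools < 2:
        raise ValueError("n_subpools must be at least 2")
    if not plants or not pool_is_positive(plants):
        return []                      # negative pool: discard all members
    if len(plants) == 1:
        return list(plants)            # positive pool of one: plant identified
    size = -(-len(plants) // n_subpools)   # ceiling division
    hits: List[str] = []
    for start in range(0, len(plants), size):
        hits.extend(deconvolve(plants[start:start + size],
                               pool_is_positive, n_subpools))
    return hits


# Hypothetical example: 64 lines, one of which carries the insertion.
lines = [f"line_{i:03d}" for i in range(64)]
carriers = {"line_037"}


def assay(pool: Sequence[str]) -> bool:
    """Stand-in for a PCR on pooled DNA."""
    return any(p in carriers for p in pool)


found = deconvolve(lines, assay)
print(found)
```

With four subpools per round, a single carrier among 64 lines is reached after three rounds of splitting, which is why the text describes the procedure as laborious for large collections.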

Recently, some groups have begun the process of sequencing insertion site flanking regions
from individual plants in large insertion mutant populations, in effect prescreening a subset of lines for
genomic insertion sites (Parinov et al., (1999) Plant Cell 11, 2263-2270; Tissier et al., (1999)
Plant Cell 11, 1841-1852). The advantage of this approach is that the laborious and time-consuming
process of PCR-based screening and deconvolution of pools is avoided.

A large database of insertion site flanking sequences from approximately 100,000 T-DNA
mutagenized Arabidopsis plants of the Columbia ecotype (GARLIC lines) is prepared. T-DNA left
border sequences from individual plants are amplified using a modified thermal asymmetric
interlaced-polymerase chain reaction (TAIL-PCR) protocol (Liu et al., (1995) Plant J. 8, 457-463).
Left border TAIL-PCR products are sequenced and assembled into a database that
associates sequence tags with each of the approximately 100,000 plants in the mutant collection.
Screening the collection for insertions in genes of interest involves a simple gene name or sequence
BLAST query of the insertion site flanking sequence database, and search results point to individual
lines. Insertions are confirmed using PCR.

Analysis of the GARLIC insert lines suggests that there are 76,856 insertions that localize to a
subset of the genome representing coding regions and promoters of 22,880 genes. Of these, 49,231
insertions lie in the promoters of over 18,572 genes, and an additional 27,625 insertions are located
within the coding regions of 13,612 genes. Approximately 25,000 T-DNA left border mTAIL-PCR
products (25% of the total 102,765) do not have significant matches to the subset of the genome
representing promoters and coding regions, and are therefore presumed to lie in noncoding and/or
repetitive regions of the genome.
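
The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the overlap estimate assumes the gene counts are exact, although the text gives the promoter figure as "over 18,572".

```python
promoter_insertions = 49_231
coding_insertions = 27_625
genes_promoter_hit = 18_572     # stated as "over 18,572"
genes_coding_hit = 13_612
genes_hit_total = 22_880
total_products = 102_765
unmatched_products = 25_000     # stated as approximate

# Promoter and coding insertions together give the quoted total.
genic_insertions = promoter_insertions + coding_insertions
print(genic_insertions)

# Share of TAIL-PCR products without a promoter or coding-region match.
unmatched_pct = round(100 * unmatched_products / total_products, 1)
print(unmatched_pct)

# Inclusion-exclusion: genes hit in both promoter and coding region.
genes_hit_in_both = genes_promoter_hit + genes_coding_hit - genes_hit_total
print(genes_hit_in_both)
```

The first value reproduces the 76,856 insertions stated in the text, and the second shows that "25%" is a rounding of roughly 24.3%. The third suggests that on the order of 9,300 genes carry insertions in both the promoter and the coding region.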

The Arabidopsis T-DNA GARLIC insertion collection is used to investigate the roles of
certain genes in the grain filling process. Target genes are chosen using a variety of criteria, including
public reports of mutant phenotypes, RNA profiling experiments, and sequence similarity to genes
implicated in grain filling. Plant lines with insertions in genes of interest are then identified. Each
T-DNA insertion line is represented by a seed lot collected from a plant that is hemizygous for a
particular T-DNA insertion. Plants homozygous for insertions of interest are identified using a PCR
assay. The seed produced by these plants is homozygous for the T-DNA insertion mutation of
interest.

Homozygous mutant plants are tested for altered grain composition. The genes interrupted in 
these mutants contribute to the observed phenotype. The genes interrupted in these mutants interfere 
with the normal grain filling process. 

Rice orthologs of the Arabidopsis genes affecting the grain filling process and thus grain
composition are identified by similarity searching of a rice database using the Double-Affine
Smith-Waterman algorithm (BLASTP with e values better than ~ 10 ).
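
For orientation, the sketch below shows a standard Smith-Waterman local alignment score with a single affine gap penalty (the Gotoh recurrences). It is not the Double-Affine variant named in the text, whose details are not given here, and the scoring values are arbitrary assumptions chosen for illustration; a protein search would normally use a substitution matrix instead of a flat match/mismatch score.

```python
def sw_affine(a: str, b: str, match: int = 2, mismatch: int = -1,
              gap_open: int = -3, gap_extend: int = -1) -> int:
    """Best local alignment score; a gap of length k costs
    gap_open + (k - 1) * gap_extend."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best local score ending at (i, j); E/F: best score ending in a gap.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]   # gap in sequence a
    F = [[neg] * (m + 1) for _ in range(n + 1)]   # gap in sequence b
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


print(sw_affine("ACGT", "ACGT"))            # four matches
print(sw_affine("ACGTACGT", "ACGTTACGT"))   # eight matches and one gap
```

Separating gap opening from gap extension lets a single long insertion score better than several scattered ones, which matters when comparing genes across species as distant as rice and Arabidopsis.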

Example 6: Cloning and Sequencing of Nucleic Acid Molecules from Rice

6.1 Genomic DNA: Plant genomic DNA samples are isolated from a collection of tissues
which are listed in Table 1. Individual tissues are collected from a minimum of five plants and pooled.
DNA can be isolated according to one of three procedures: standard procedures described
by Ausubel et al. (1995), a quick leaf prep described by Klimyuk et al. (1993), or using FTA paper
(Life Technologies).

For the latter procedure, a piece of plant tissue such as, for example, leaf tissue is excised
from the plant, placed on top of the FTA paper and covered with a small piece of parafilm that
serves as a barrier material to prevent contamination of the crushing device. In order to drive the sap
and cells from the plant tissue into the FTA paper matrix for effective cell lysis and nucleic acid
entrapment, a crushing device is used to mash the tissue into the FTA paper. The FTA paper is air
dried for an hour. For analysis of DNA, the samples can be archived on the paper until analysis.
Two mm punches are removed from the specimen area on the FTA paper using a 2 mm Harris
Micro Punch™ and placed into PCR tubes. Two hundred (200) microliters of FTA purification
reagent is added to the tube containing the punch and vortexed at low speed for 2 seconds. The
tube is then incubated at room temperature for 5 minutes. The solution is removed with a pipette and
the wash is repeated one more time. Two hundred (200) microliters of TE (10 mM Tris, 0.1 mM
EDTA, pH 8.0) is added and the wash is repeated two more times. The PCR mix is added directly
to the punch for subsequent PCR reactions.

6.2 Cloning of Candidate cDNA: A candidate cDNA is amplified from total RNA isolated
from rice tissue after reverse transcription using primers designed against the computationally
predicted cDNA. Primers designed based on the genomic sequence can be used to PCR amplify the
full-length cDNA (start to stop codon) from first strand cDNA prepared from rice cultivar
Nipponbare tissue.

The Qiagen RNeasy kit (Qiagen, Hilden, Germany) is used for extraction of total RNA. The
Superscript II kit (Invitrogen, Carlsbad, USA) is used for the reverse transcription reaction. PCR
amplification of the candidate cDNA is carried out using the reverse primer sequence located at the
translation start of the candidate gene in 5' - 3' direction. This is performed with high-fidelity Taq
polymerase (Invitrogen, Carlsbad, USA).

The PCR fragment is then cloned into pCR2.1-TOPO (Invitrogen) or the pGEM-T easy
vector (Promega Corporation, Madison, Wis., USA) per the manufacturer's instructions, and
several individual clones are subjected to sequencing analysis.

6.3 DNA sequencing: DNA for 2-4 independent clones is miniprepped following
the manufacturer's instructions (Qiagen). DNA is subjected to sequencing analysis using the
BigDye™ Terminator Kit according to the manufacturer's instructions (ABI). Sequencing makes use
of primers designed to both strands of the predicted gene of interest. DNA sequencing is performed
using standard dye-terminator sequencing procedures and automated sequencers (models 373 and
377; Applied Biosystems, Foster City, CA). All sequencing data are analyzed and assembled using
the Phred/Phrap/Consed software package (University of Washington) to an error ratio equal to or
less than 10^-4 at the consensus sequence level.

The consensus sequence from the sequencing analysis is then validated as being intact
and the correct gene in several ways. The coding region is checked for being full length (predicted
start and stop codons present) and uninterrupted (no internal stop codons). Alignment with the gene
prediction and BLAST analysis is used to ascertain that this is in fact the right gene.
The clones are sequenced to verify their correct amplification.
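
The coding-region checks listed above (start codon present, stop codon present, no internal stop codons) are easy to automate. The following is a minimal sketch under the assumption that the consensus is supplied as a plain nucleotide string spanning start to stop codon; the function names are hypothetical and the standard nuclear stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq: str) -> dict:
    """Report the full-length and uninterrupted checks for a coding sequence."""
    s = seq.upper().replace("U", "T")
    in_frame = len(s) % 3 == 0
    # Only complete codons are considered; a trailing partial codon is ignored.
    codons = [s[i:i + 3] for i in range(0, len(s) - len(s) % 3, 3)]
    return {
        "in_frame": in_frame,
        "has_start": s.startswith("ATG"),
        "has_stop": in_frame and bool(codons) and codons[-1] in STOP_CODONS,
        # Nucleotide offsets of stop codons before the final codon.
        "internal_stops": [3 * i for i, c in enumerate(codons[:-1])
                           if c in STOP_CODONS],
    }


def is_intact(seq: str) -> bool:
    """True if the sequence is full length and uninterrupted."""
    r = check_cds(seq)
    return (r["in_frame"] and r["has_start"] and r["has_stop"]
            and not r["internal_stops"])


print(is_intact("ATGGCCTAA"))                          # intact toy ORF
print(check_cds("ATGTAAGCCTAA")["internal_stops"])     # premature stop
```

A check of this kind only confirms that the reading frame is open; confirming that the clone is the right gene still relies on the alignment and BLAST comparison described in the text.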

Example 7: Functional analysis in plants

A plant complementation assay can be used for the functional characterization of the grain 
filling genes according to the invention. 

Rice and Arabidopsis putative orthologue pairs are identified using BLAST comparisons,
TFASTXY comparisons, and Double-Affine Smith-Waterman similarity searches. Constructs
containing a rice cDNA or genomic clone inserted between the promoter and terminator of the
Arabidopsis orthologue are generated using overlap PCR (Gene 77, 61-68 (1989)) and
GATEWAY cloning (Life Technologies Invitrogen). For ease of cloning, rice cDNA clones are
preferred to rice genomic clones. A three stage PCR strategy is used to make these constructs.

(1) In the first stage, primers are used to PCR amplify: (i) 2 kb upstream of the translation
start site of the Arabidopsis orthologue, (ii) the coding region or cDNA of the rice orthologue, and
(iii) the 500 bp immediately downstream of the Arabidopsis orthologue's translation stop site. Primers
are designed to incorporate onto their 5' ends at least 16 bases of the 3' end of the adjacent
fragment, except in the case of the most distal primers which flank the gene construct (the forward
primer of the promoter and the reverse primer of the terminator). The forward primers of the
promoters contain on their 5' ends partial AttB1 sites, and the reverse primers of the terminators
contain on their 5' ends partial AttB2 sites, for Gateway cloning.

(2) In the second stage, overlap PCR is used to join either the promoter and the coding
region, or the coding region and the terminator.

(3) In the third stage, either the promoter-coding region product is joined to the
terminator or the coding region-terminator product is joined to the promoter, using overlap
PCR and amplification with full Att site-containing primers, to link all three fragments and put full
Att sites at the construct termini.
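
The junction primers used in the first stage can be sketched as follows. This is an illustration under stated assumptions: the 16-base overlap comes from the text, while the 20-base annealing length, the function names and the toy sequences are hypothetical, and no melting-temperature or secondary-structure checks are attempted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def junction_primers(upstream: str, downstream: str,
                     anneal: int = 20, overlap: int = 16):
    """Primers spanning the junction upstream|downstream (top strand, 5'->3').

    fwd amplifies the downstream fragment and carries, on its 5' end,
    the last `overlap` bases of the upstream fragment.
    rev amplifies the upstream fragment and carries, on its 5' end,
    the reverse complement of the first `overlap` bases of the downstream one.
    """
    up, down = upstream.upper(), downstream.upper()
    fwd = up[-overlap:] + down[:anneal]
    rev = revcomp(up[-anneal:] + down[:overlap])
    return fwd, rev


# Toy fragments standing in for a promoter and a coding region.
promoter = "ACGT" * 10
coding = "TTGGCCAA" * 5
fwd, rev = junction_primers(promoter, coding)
print(fwd)
print(rev)
```

Because both primers cover the same junction, the two first-stage products share a stretch of identical sequence, which is what lets them prime on each other during the overlap PCR of the second and third stages.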






The fused three-fragment piece flanked by Gateway cloning sites is introduced into the LTI
donor vector pDONR201 (Invitrogen) using the BP clonase reaction, for confirmation by
sequencing. Sequence-confirmed constructs are introduced, using the LR clonase reaction, into a
binary vector containing Gateway cloning sites such as, for example, pAS200.

The pAS200 vector was created by inserting the Gateway cloning cassette RfA into the
Acc65I site of pNOV3510.

pNOV3510 was created by ligation of inverted pNOV2114 VS1 binary into pNOV3507, a
vector containing a PTX5' Arab Protox promoter driving the PPO gene with the Nos terminator.

pNOV2114 was created by insertion of virGN54D (Pazour et al. 1992, J. Bacteriol.
174:4169-4174) from pAD1289 (Hansen et al. 1994, PNAS 91:7603-7607) into pHiNK085.

pHiNK085 was created by deleting the 35S:PMI cassette and M13 ori in pVictorHiNK.

pVictorHiNK was created by modifying the T-DNA of pVictor (described in WO
97/04112) to delete M13 derived sequences and to improve its cloning versatility by introducing the
BIGLINK polylinker.

The sequence of the pVictorHiNK vector is disclosed in SEQ ID NO: 5 in WO 00/6837,
which is incorporated herein by reference. The pVictorHiNK vector contains the following
constituents that are of functional importance:

The origin of replication (ORI) functional in Agrobacterium is derived from the
Pseudomonas aeruginosa plasmid pVS1 (Itoh et al. 1984. Plasmid 11: 206-220; Itoh and
Haas, 1985. Gene 36: 27-36). The pVS1 ORI is only functional in Agrobacterium and can
be mobilised by the helper plasmid pRK2013 from E. coli into A. tumefaciens by means of a
triparental mating procedure (Ditta et al., 1980. Proc. Natl. Acad. Sci. USA 77: 7347-7351).

The ColE1 origin of replication functional in E. coli is derived from pUC19 (Yanisch-Perron
et al., 1985. Gene 33: 103-119).

The bacterial resistance to spectinomycin and streptomycin encoded by a 0.93 kb
fragment from transposon Tn7 (Fling et al., 1985. Nucl. Acids Res. 13: 7095) functions as
selectable marker for maintenance of the vector in E. coli and Agrobacterium. The gene is
fused to the tac promoter for efficient bacterial expression (Amann et al., 1983. Gene 25:
167-178).

The right and left T-DNA border fragments of 1.9 kb and 0.9 kb that comprise
the 24 bp border repeats, have been derived from the Ti-plasmid of the
nopaline type Agrobacterium tumefaciens strains pTiT37 (Yadav et al.,
1982. Proc. Natl. Acad. Sci. USA. 79: 6322-6326).

The plasmid is introduced into Agrobacterium tumefaciens GV3101 pMP90 by
electroporation. The positive bacterial transformants are selected on LB medium containing 50 µg/ml
kanamycin and 25 µg/ml gentamycin. Plants are transformed by standard methodology (e.g., by
dipping flowers into a solution containing the Agrobacterium) except that 0.02% Silwet L-77 (Lehle
Seeds, Round Rock, TX) is added to the bacterial suspension and the vacuum step omitted. Five
hundred (500) mg of seeds are planted per 2 ft² flat of soil, and progeny seeds are selected for
transformants using PPO selection.

Primary transformants are analyzed for complementation. Primary transformants are
genotyped for the Arabidopsis mutation and presence of the transgene. When possible, >50 mutants
harboring the transgene should be phenotyped to observe variation due to transgene copy number
and expression.

Example 8: Vector construction for overexpression and gene "knockout" experiments
8.1 Overexpression

Vectors used for expression of full-length "grain filling candidate genes" of interest in plants 
(overexpression) are designed to overexpress the protein of interest and are of two general types, 
biolistic and binary, depending on the plant transformation method to be used. 

For biolistic transformation (biolistic vectors), the requirements are as follows: 

1. a backbone with a bacterial selectable marker (typically, an antibiotic resistance gene) and
origin of replication functional in Escherichia coli (E. coli; e.g., ColE1), and

2. a plant-specific portion consisting of:

a. a gene expression cassette consisting of a promoter (e.g., ZmUBIint MOD), the gene
of interest (typically, a full-length cDNA) and a transcriptional terminator (e.g.,
Agrobacterium tumefaciens nos terminator);

b. a plant selectable marker cassette, consisting of a promoter (e.g., rice Act1D-BV
MOD), selectable marker gene (e.g., phosphomannose isomerase, PMI) and
transcriptional terminator (e.g., CaMV terminator).

Vectors designed for transformation by Agrobacterium tumefaciens (A. tumefaciens; binary
vectors) consist of:

1. a backbone with a bacterial selectable marker functional in both E. coli and A. tumefaciens
(e.g., spectinomycin resistance mediated by the aadA gene) and two origins of replication,
functional in each of the aforementioned bacterial hosts, plus the A. tumefaciens virG gene;

2. a plant-specific portion as described for biolistic vectors above, except in this instance this
portion is flanked by A. tumefaciens right and left border sequences which mediate transfer
of the DNA flanked by these two sequences to the plant.

8.2 Knock out vectors 

Vectors designed for reducing or abolishing expression of a single gene or of a family of
related genes (knockout vectors) are also of two general types corresponding to the methodology
used to downregulate gene expression: antisense or double-stranded RNA interference (dsRNAi).

(a) Anti-sense 

For antisense vectors, a full-length or partial gene fragment (typically, a portion of the cDNA)
can be used in the same vectors described for full-length expression, as part of the gene expression
cassette. For antisense-mediated down-regulation of gene expression, the coding region of the gene
or gene fragment will be in the opposite orientation relative to the promoter; thus, mRNA will be
made from the non-coding (antisense) strand in planta.
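
The orientation flip described for antisense vectors amounts to placing the reverse complement of the gene fragment downstream of the promoter. The sketch below illustrates this with a toy sequence; the function names and the 12-base fragment are hypothetical and stand in for a real cDNA fragment.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_insert(cdna_fragment: str) -> str:
    """Sequence to place 3' of the promoter, top strand read 5'->3'."""
    return reverse_complement(cdna_fragment)


def transcript(insert: str) -> str:
    """RNA produced from the insert (same sense as the top strand)."""
    return insert.upper().replace("T", "U")


sense_fragment = "ATGGCCAAGTTC"            # toy stand-in for a cDNA fragment
insert = antisense_insert(sense_fragment)
print(insert)
print(transcript(insert))

# The antisense RNA pairs with the endogenous mRNA: reverse-complementing
# the insert recovers the sense fragment.
print(reverse_complement(insert) == sense_fragment)
```

The same expression cassette therefore serves both purposes; only the orientation of the insert relative to the promoter differs between an overexpression and an antisense construct.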

(b) dsRNAi 

For dsRNAi vectors, a partial gene fragment (typically, 300 to 500 basepairs long) is used in
the gene expression cassette, and is expressed in both the sense and antisense orientations,
separated by a spacer region (typically, a plant intron, e.g. the OsSH1 intron 1, or a selectable
marker, e.g. conferring kanamycin resistance). Vectors of this type are designed to form a
double-stranded mRNA stem, resulting from the basepairing of the two complementary gene fragments in
planta.
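
The layout of a dsRNAi insert (sense arm, spacer, antisense arm) can be sketched in a few lines. The fragment and spacer below are arbitrary placeholder sequences chosen for illustration, not the OsSH1 intron or any real gene, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hairpin_insert(fragment: str, spacer: str) -> str:
    """Sense arm + spacer + antisense arm, read 5'->3' on the top strand."""
    fragment = fragment.upper()
    return fragment + spacer.upper() + reverse_complement(fragment)


def arms_pair(insert: str, arm_len: int) -> bool:
    """True if the first and last arm_len bases can base-pair as a stem."""
    return insert[:arm_len] == reverse_complement(insert[-arm_len:])


fragment = "ATGGCTAGCTTGACCGGATC" * 15     # 300 nt placeholder fragment
spacer = "GTAAGTCTGCAG"                    # placeholder spacer
insert = hairpin_insert(fragment, spacer)

print(len(insert))                         # 300 + 12 + 300
print(arms_pair(insert, len(fragment)))
```

When the insert is transcribed, the two arms fold back on each other and the spacer forms the loop, giving the double-stranded stem described above.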


Biolistic or binary vectors designed for overexpression or knockout can vary in a number of
different ways, including e.g. the selectable markers used in plant and bacteria, the transcriptional
terminators used in the gene expression and plant selectable marker cassettes, and the methodologies
used for cloning in gene or gene fragments of interest (typically, conventional restriction enzyme-mediated
or Gateway™ recombinase-based cloning). An important variant is the nature of the gene
expression cassette promoter driving expression of the gene or gene fragment of interest in most
tissues of the plants (constitutive, e.g. ZmUBIint MOD), in specific plant tissues (e.g. maize ADP-gpp
for endosperm-specific expression), or in an inducible fashion (e.g. GAL4bsBz1 for estradiol-inducible
expression in lines constitutively expressing the cognate transcriptional activator for this
promoter).

Example 9: Insertion of a "grain filling candidate gene" into Expression Vector

A validated rice cDNA clone in pCR2.1-TOPO or the pGEM-T easy vector is subcloned
using conventional restriction enzyme-based cloning into a vector, downstream of the maize ubiquitin
promoter and intron, and upstream of the Agrobacterium tumefaciens nos 3' end transcriptional
terminator. The resultant gene expression cassette (promoter, "grain filling candidate gene" and
terminator) is further subcloned, using conventional restriction enzyme-based cloning, into the
pNOV2117 binary vector (Negrotto et al. (2000) Plant Cell Reports 19, 798-803; plasmid
pNOV117 disclosed in this article corresponds to pNOV2117 described herein; the nucleotide
sequence of pNOV2117 is provided in SEQ ID NO: 44 of WO 01/73087), generating
pNOVCAND.

The pNOVCAND binary vector is designed for transformation and over-expression of the
"grain filling candidate gene" in monocots. It consists of a binary backbone containing the sequences
necessary for selection and growth in Escherichia coli DH-5α (Invitrogen) and Agrobacterium
tumefaciens LBA4404 (pAL4404; pSB1), including the bacterial spectinomycin antibiotic
resistance aadA gene from E. coli transposon Tn7, origins of replication for E. coli (ColE1) and A.
tumefaciens (VS1), and the A. tumefaciens virG gene. In addition to the binary backbone, which
is identical to that of pNOV2114 described herein previously (see Example 7 above), pNOV2117
contains the T-DNA portion flanked by the right and left border sequences, and including the
Positech™ (Syngenta) plant selectable marker (WO 94/20627) and the "grain filling candidate gene"
gene expression cassette. The Positech™ plant selectable marker confers resistance to mannose
and in this instance consists of the maize ubiquitin promoter driving expression of the PMI
(phosphomannose isomerase) gene, followed by the cauliflower mosaic virus transcriptional
terminator.

Plasmid pNOV2117 is introduced into Agrobacterium tumefaciens LBA4404 (pAL4404;
pSB1) by electroporation. Plasmid pAL4404 is a disarmed helper plasmid (Ooms et al. (1982)
Plasmid 7, 15-29). Plasmid pSB1 is a plasmid with a wide host range that contains a region of
homology to pNOV2117 and a 15.2 kb KpnI fragment from the virulence region of pTiBo542
(Ishida et al. (1996) Nat Biotechnol 14, 745-750). Introduction of plasmid pNOV2117 into
Agrobacterium strain LBA4404 results in a co-integration of pNOV2117 and pSB1.

Alternatively, plasmid pCIB7613, which contains the hygromycin phosphotransferase (hpt) 
gene (Gritz and Davies, Gene 25, 179-188, 1983) as a selectable marker, may be employed for 
transformation. 

Plasmid pCIB7613 (see WO 98/06860, incorporated herein by reference in its entirety) is
selected for rice transformation. In pCIB7613, the transcription of the nucleic acid sequence coding
hygromycin-phosphotransferase (HYG gene) is driven by the corn ubiquitin promoter (ZmUbi) and
enhanced by corn ubiquitin intron 1. The 3' polyadenylation signal is provided by NOS 3'
nontranslated region.

Other useful plasmids include pNADII002 (GAL4-ER-VP16), which contains the yeast GAL4
DNA binding domain (Keegan et al., Science, 231:699 (1986)), the mammalian estrogen receptor
ligand binding domain (Greene et al., Science, 231:1150 (1986)) and the transcriptional activation
domain of the HSV VP16 protein (Triezenberg et al., 1988), and pSGCDL1 (GAL4BS Bz1
Luciferase), which carries the firefly luciferase reporter gene under control of a minimal maize
Bronze1 (Bz1) promoter with 10 upstream synthetic GAL4 binding sites. Both hpt and GAL4-ER-VP16 are
constitutively expressed using the maize ubiquitin promoter. All constructs use
termination signals from the nopaline synthase gene.

Example 10: Plant Transformation

10.1 Rice Transformation
pNOVCAND is transformed into a rice cultivar (Kaybonnet) using Agrobacterium-mediated
transformation, and mannose-resistant calli are selected and regenerated.
Agrobacterium is grown on YPC solid plates for 2-3 days prior to experiment initiation.
Agrobacterial colonies are suspended in liquid MS media to an OD of 0.2 at A600nm.
Acetosyringone is added to the agrobacterial suspension to a concentration of 200 µM and agro is
induced for 30 min.

Three-week-old calli which are induced from the scutellum of mature seeds in the N6 medium
(Chu, C.C. et al., Sci. Sin., 18, 659-668 (1975)) are incubated in the agrobacterium solution in a
100 x 25 petri plate for 30 minutes with occasional shaking. The solution is then removed with a
pipet and the callus transferred to a MSAs medium which is overlaid with sterile filter paper.
Co-cultivation is continued for 2 days in the dark at 22°C.

Calli are then placed on MS-Timentin plates for 1 week. After that they are transferred to PAA
+ mannose selection media for 3 weeks.

Growing calli (putative events) are picked and transferred to PAA + mannose media and
cultivated for 2 weeks in light.

Colonies are transferred to MS20SorbKinTim regeneration media in plates for 2 weeks in light.
Small plantlets are transferred to MS20SorbKinTim regeneration media in GA7 containers. When
they reach the lid, they are transferred to soil in the greenhouse.

Expression of the "grain filling candidate gene" in transgenic T0 plants is analyzed. Additional
rice cultivars, such as but not limited to, Nipponbare, Taipei 309 and Fuzisaka 2 are also
transformed and assayed for expression of the "grain filling candidate gene" product and enhanced
protein expression.

10.2 Maize transformation
Transformation of immature maize embryos is performed essentially as described in Negrotto et
al., (2000) Plant Cell Reports 19: 798-803. For this example, all media constituents are as
described in Negrotto et al., supra. However, various media constituents described in the literature
may be substituted.

1. Transformation plasmids and selectable marker

The genes used for transformation are cloned into a vector suitable for maize transformation as
described in Example 17. Vectors used contain the phosphomannose isomerase (PMI) gene
(Negrotto et al. (2000) Plant Cell Reports 19: 798-803).

2. Preparation of Agrobacterium tumefaciens

Agrobacterium strain LBA4404 (pSB1) containing the plant transformation plasmid is grown
on YEP (yeast extract (5 g/L), peptone (10 g/L), NaCl (5 g/L), 15 g/L agar, pH 6.8) solid medium for 2
to 4 days at 28°C. Approximately 0.8 x 10^9 Agrobacteria are suspended in LS-inf media
supplemented with 100 µM acetosyringone (As) (Negrotto et al., (2000) Plant Cell Rep 19: 798-803).
Bacteria are pre-induced in this medium for 30-60 minutes.

3. Inoculation 

Immature embryos from A188 or other suitable maize genotypes are excised from 8-12 day
old ears into liquid LS-inf + 100 µM As. Embryos are rinsed once with fresh infection medium.
Agrobacterium solution is then added and embryos are vortexed for 30 seconds and allowed to
settle with the bacteria for 5 minutes. The embryos are then transferred scutellum side up to LSAs
medium and cultured in the dark for two to three days. Subsequently, between 20 and 25 embryos
per petri plate are transferred to LSDc medium supplemented with cefotaxime (250 mg/l) and silver
nitrate (1.6 mg/l) and cultured in the dark at 28°C for 10 days.

4. Selection of transformed cells and regeneration of transformed plants

Immature embryos producing embryogenic callus are transferred to LSD1M0.5S medium.
The cultures are selected on this medium for 6 weeks with a subculture step at 3 weeks. Surviving
calli are transferred either to LSD1M0.5S medium to be bulked-up or to Reg1 medium. Following
culturing in the light (16 hour light/8 hour dark regimen), green tissues are then transferred to Reg2
medium without growth regulators and incubated for 1-2 weeks. Plantlets are transferred to
Magenta GA-7 boxes (Magenta Corp, Chicago, Ill.) containing Reg3 medium and grown in the light.
Plants that are PCR positive for the promoter-reporter cassette are transferred to soil and grown in
the greenhouse.

Example 11: Promoter Analysis 

The gene chip experiments described above in Examples 3 and 4 are designed to uncover
genes that are expressed in seed tissue during grain filling. Candidate promoters are identified based
upon the expression profiles of the associated transcripts, representatives of which are provided in
SEQ ID NOs: 643-883.

Candidate promoters are obtained by PCR and fused to a GUS reporter gene containing an
intron. Both histochemical and fluorometric GUS assays are carried out on stably transformed rice and
maize plants and GUS activity is detected in the transformants.
Further, transient assays with the promoter::GUS constructs are carried out in rice
embryogenic callus and GUS activity is detected by histochemical staining according to the protocol
described below (see Example 12).

Construction of Binary Promoter::Reporter Plasmids

To construct a binary promoter::reporter plasmid for rice transformation, a vector containing a
promoter of interest (i.e., the DNA sequence 5' of the initiation codon for the gene of interest) is
used, which results from recombination in a BP reaction between a PCR product using the promoter
of interest as a template and pDONR201™, producing an entry vector. The regulatory/promoter
sequence is fused to the GUS reporter gene (Jefferson et al., 1987) by recombination using
GATEWAY™ Technology according to the manufacturer's protocol as described in the Instruction
Manual (GATEWAY™ Cloning Technology, GIBCO BRL, Rockville, MD
http://www.lifetech.com/).

Briefly, the Gateway Gus-intron-Gus (GIG)/NOS expression cassette is ligated into
pNOV2117 binary vector in 5' to 3' orientation. The 4.1 kb expression cassette is ligated into the
Kpn-I site of pNOV2117, then clones are screened for orientation to obtain pNOV2346, a
GATEWAY™ adapted binary destination vector.

The promoter fragment in the entry vector is recombined via the LR reaction with the binary
destination vector containing the GUS coding region with an intron that has an attR site 5' to the
GUS reporter, producing a binary vector with a promoter fused to the GUS reporter
(pNOVCANDProm). The orientation of the inserted fragment is maintained by the att sequences
and the final construct is verified by sequencing. The construct is then transformed into
Agrobacterium tumefaciens strains by electroporation as described herein previously (see Example
9).



Example 12: Transient Expression Analysis of Candidate Promoters in Rice Embryogenic
Callus

Materials: 

■ Embryogenic rice callus (Kaybonnet cultivar)

■ LBA 4404 Agrobacterium strains

■ KCMS liquid media for re-suspending bacterial pellet

■ 200 mM stock (40 mg/ml) Acetosyringone

■ Sterile filter paper discs (8.5 mm in diameter)

■ LB spec liquid culture

■ MS-CIM media plates

■ MS-AS plates (co-cultivation plates)

■ MS-Tim plates (recovery plates)

■ Gus staining solution

Methods: 

Induction of Embryogenic callus:

1. Sterilize mature Kaybonnet rice seeds in 40% ultra Clorox, 1 drop Tween 20, for 40 min.

2. Rinse with sterile water and plate on MS-CIM media (12 seeds/plate).

3. Grow in dark for four weeks.

4. Isolate embryogenic calli from scutellum to MS-CIM.

5. Let grow in dark 8 days before use for transformation.



Agrobacterium preparation and induction:

1. Start 6 mL shaking cultures of LBA4404 Agrobacterium strains harboring rice promoter
binary plasmids.

2. Grow the cultures at room temperature for 48 hrs in the rotary shaker.

3. Spin down the cultures at 8,000 rpm at 4°C and re-suspend bacterial pellets in 10 ml of
KCMS media supplemented with 100 µM Acetosyringone.

4. Place in the shaker at room temp for 1 hr for induction of Agrobacterium virulence genes.

5. In a sterile hood dilute Agrobacterium cultures 1:3 in KCMS media and transfer diluted
cultures into deep petri dishes.
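
The acetosyringone volumes implied by the materials list and step 3 follow from the dilution relation C1·V1 = C2·V2. The sketch below is a worked calculation, not part of the protocol; the function name is hypothetical, and the molar mass of acetosyringone (C10H12O4, about 196.2 g/mol) is supplied here rather than taken from the text.

```python
ACETOSYRINGONE_G_PER_MOL = 196.2   # molar mass, supplied as an assumption


def stock_volume_ul(stock_mM: float, final_uM: float,
                    final_volume_ml: float) -> float:
    """Microliters of stock needed, from C1 * V1 = C2 * V2.

    (uM * ml) / mM reduces to microliters, so no unit factors are needed.
    """
    return final_uM * final_volume_ml / stock_mM


# Consistency of the stock label: 40 mg/ml expressed in mM.
stock_mM_from_mass = round(40 / ACETOSYRINGONE_G_PER_MOL * 1000, 1)
print(stock_mM_from_mass)

# Step 3: 10 ml of KCMS at 100 uM from the 200 mM stock.
print(stock_volume_ul(200, 100, 10))

# One liter of medium at 200 uM from the same stock.
print(stock_volume_ul(200, 200, 1000))
```

The first value shows that 40 mg/ml corresponds to roughly 204 mM, so the "200 mM" label is a rounded figure. The second gives 5 µl of stock per 10 ml of resuspension medium.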

Inoculation of plant material and staining: 

6. In a sterile hood transfer embryogenic callus into diluted Agrobacterium solution and
incubate for 30 minutes.

7. In a sterile hood blot callus tissue on sterile filter paper and transfer on MS-AS plates.

8. Co-culture plates in 22°C growth chamber in the dark for two days.

9. In a sterile hood transfer callus tissue to MS-Tim plates for the tissue recovery (the presence
of Timentin will prevent Agrobacterium growth).

10. Incubate tissue on MS-Tim media for two days at 22°C in the dark.

11. Remove callus tissue from the plates and stain for 48 hrs in GUS staining solution.

12. De-stain tissue in 70% EtOH for 24 hours.
Recipes:

KCMS media (liquid), pH to 5.5
100ml/l MS Major Salts, 10ml/l MS Minor Salts, 5ml/l MS iron stock, 0.5M K2HPO4, 0.1mg/ml
Myo-Inositol, 1.3 µg/ml Thiamine, 0.2 µg/ml 2,4-D (1 mg/ml), 0.1g/ml Kinetin, 3% Sucrose,
100 µM Acetosyringone

MS-CIM media, pH 5.8
MS Basal salt (4.3g/L), B5 Vitamins (200 X) (5ml/L), 2% Sucrose (20g/L), Proline
(500mg/L), Glutamine (500mg/L), Casein Hydrolysate (300mg/L), 2 µg/ml 2,4-D,
Phytagel (3g/L)

MS-AS medium, pH 5.8
MS Basal salt (4.3g/L), B5 Vitamins (200 X) (5ml/L), 2% Sucrose (20g/L), Proline
(500mg/L), Glutamine (500mg/L), Casein Hydrolysate (300mg/L), 2 µg/ml 2,4-D,
Phytagel (3g/L), 200 µM Acetosyringone

MS-Tim media, pH 5.8
MS Basal salt (4.3g/L), B5 Vitamins (200 X) (5ml/L), 2% Sucrose (20g/L), Proline
(500mg/L), Glutamine (500mg/L), Casein Hydrolysate (300mg/L), 2 µg/ml 2,4-D,
Phytagel (3g/L), 400mg/l Timentin

GUS staining solution, pH 7
0.3M Mannitol; 0.02M EDTA, pH=7.0; 0.04M NaH2PO4; 1mM X-gluc
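
The acetosyringone additions in the protocol and recipes above (100 µM in KCMS, 200 µM in MS-AS, from a 200mM stock) follow from the usual dilution relation C1 x V1 = C2 x V2. The following is a minimal sketch of that arithmetic only; the function name and the 1 litre batch size are illustrative assumptions and are not taken from the protocol.

```python
def stock_volume_ul(stock_mM: float, final_uM: float, final_volume_ml: float) -> float:
    """Return the stock volume (in microlitres) needed to reach final_uM
    in final_volume_ml of medium, using C1 * V1 = C2 * V2."""
    stock_uM = stock_mM * 1000.0               # mM -> uM
    final_volume_ul = final_volume_ml * 1000.0  # ml -> ul
    return final_uM * final_volume_ul / stock_uM


# 10 ml of KCMS at 100 uM acetosyringone from the 200 mM stock
kcms_ul = stock_volume_ul(stock_mM=200, final_uM=100, final_volume_ml=10)

# Assumed batch size: 1 litre of MS-AS medium at 200 uM acetosyringone
ms_as_ul = stock_volume_ul(stock_mM=200, final_uM=200, final_volume_ml=1000)

print(f"KCMS (10 ml): {kcms_ul:.1f} ul of stock")
print(f"MS-AS (1 l):  {ms_as_ul:.1f} ul of stock")
```

Under these assumptions the 10ml resuspension needs 5 µl of stock and one litre of MS-AS needs 1 ml of stock.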

The binary Promoter:Reporter Plasmids described in Example 9 above can also be used for stable
transformation of rice and maize plants according to the protocols provided in Examples 10.1 and
10.2, respectively.

Example 13: Analysis of mutant and transgenic plant material

Two tiers of assays can be used for analysis of the mutant and transgenic plant material.

-Near InfraRed (NIR) spectrophotometric analysis of seeds.

NIR enables evaluation of changes in starch, oil, protein and fiber content at very high throughput (1
sample/sec).

-DIA or MRI imaging






DIA or MRI imaging allows observation of gross morphology and surface area of major seed tissues
and compartments (embryo, aleurone, endosperm, seed coat). Transgenic lines can also be
physically sectioned and directly observed for changes in seed compartment morphology.

Lines showing alterations in grain composition will be advanced to a second tier of assays dependent
upon the nature of the change detected:

1) Protein track:   1-D and 2-D protein gels    Protein profiles
                    HPLC                        Amino acid profiles
                    DNTB or papain staining     Protein redox status
                    GC                          N/C/S ratios

2) Starch track:    Iodine staining             Content, branching
                    Glucose-6-P analysis        Phosphorylation level

3) Oils track:      GC                          Oil, fatty acid profile

Example 14: Chromosomal Markers to Identify the Location of a Nucleic Acid Sequence
The sequences of the present invention can also be used for SSR mapping. SSR mapping in
rice has been described by Miyao et al. (DNA Res 3:233 (1996)) and Yang et al. (Mol Gen Genet
245:187 (1994)), and in maize by Ahn et al. (Mol Gen Genet 241:483 (1993)). SSR mapping can
be achieved using various methods. In one instance, polymorphisms are identified when sequence
specific probes flanking an SSR contained within a sequence are made and used in polymerase chain
reaction (PCR) assays with template DNA from two or more individuals or, in plants, near isogenic
lines. A change in the number of tandem repeats between the SSR-flanking sequences produces
differently sized fragments (U.S. Patent No. 5,766,847). Alternatively, polymorphisms can be
identified by using the PCR fragment produced from the SSR-flanking sequence specific primer
reaction as a probe against Southern blots representing different individuals (Refseth et al.,
Electrophoresis 18:1519 (1997)). Rice SSRs can be used to map a molecular marker closely
linked to a functional gene, as described by Akagi et al. (Genome 39:205 (1996)).

The sequences of the present invention can be used to identify and develop a variety of
microsatellite markers, including the SSRs described above, as genetic markers for comparative
analysis and mapping of genomes.
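
The size polymorphism described above (a change in tandem repeat number between fixed flanking primers gives differently sized PCR fragments) can be illustrated with simple arithmetic. This is a sketch under stated assumptions: the flank lengths, motif and repeat counts are hypothetical values chosen for illustration, not data from this disclosure.

```python
def amplicon_length(left_flank_bp: int, right_flank_bp: int,
                    motif: str, n_repeats: int) -> int:
    """Expected PCR fragment length when primers sit in the two flanks
    and the SSR between them carries n_repeats copies of motif."""
    return left_flank_bp + len(motif) * n_repeats + right_flank_bp


def size_difference(motif: str, repeats_a: int, repeats_b: int) -> int:
    """Fragment length difference (bp) between two alleles that differ
    only in repeat number."""
    return len(motif) * abs(repeats_a - repeats_b)


# Hypothetical example: two near isogenic lines, CGG repeat, 8 vs 11 copies
line_a = amplicon_length(left_flank_bp=60, right_flank_bp=55, motif="CGG", n_repeats=8)
line_b = amplicon_length(left_flank_bp=60, right_flank_bp=55, motif="CGG", n_repeats=11)
print(line_a, line_b, size_difference("CGG", 8, 11))
```

With these hypothetical numbers the two alleles give fragments of 139 bp and 148 bp, a 9 bp difference that can be resolved by gel or capillary electrophoresis.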

Many of the polynucleotides listed in Tables 2 to 11 contain at least 3 consecutive di-, tri- or
tetranucleotide repeat units in their coding region that can potentially be developed into SSR
markers. Trinucleotide motifs that can be commonly found in the coding regions of said
polynucleotides and easily identified by screening the polynucleotide sequences for said motifs are,
for example: CGG, GCC, CGC, GGC, etc. Once such a repeat unit has been found, primers can
be designed which are complementary to the region flanking the repeat unit and used in any of the
methods described below.
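
Screening a sequence for at least 3 consecutive di-, tri- or tetranucleotide repeat units, as described above, can be sketched with a regular expression. This is an illustrative sketch, not the screening procedure used for Tables 2 to 11; the function name, the record type and the example sequence are assumptions made for the illustration.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    start: int    # 0-based start of the repeat tract
    end: int      # 0-based end (exclusive)
    motif: str    # repeat unit as first encountered, e.g. "CGG"
    repeats: int  # number of consecutive copies


def find_ssrs(seq: str, min_repeats: int = 3) -> List[SSR]:
    """Find tracts of >= min_repeats consecutive copies of a 2-4 base motif."""
    seq = seq.upper()
    # Capture a 2-4 base unit (shortest first), then require it to repeat.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAA, which are not di-, tri-
            # or tetranucleotide repeats.
            continue
        hits.append(SSR(m.start(), m.end(), motif, (m.end() - m.start()) // len(motif)))
    return hits


# Hypothetical coding-region fragment containing four CGG units
for ssr in find_ssrs("AATCGGCGGCGGCGGTTA"):
    print(ssr)
```

The coordinates returned for each tract mark where the flanking regions begin, which is the input needed for designing primers complementary to those flanks. Because the scan reports the first matching reading frame, the same tract may be reported as CGG, GCG or GGC depending on where it starts.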

Sequences of the present invention can also be used in a variation of the SSR technique
known as inter-SSR (ISSR), which uses microsatellite oligonucleotides as primers to amplify
genomic segments different from the repeat region itself (Zietkiewicz et al., Genomics 20:176
(1994)). ISSR employs oligonucleotides based on a simple sequence repeat anchored or not at their
5'- or 3'-end by two to four arbitrarily chosen nucleotides, which triggers site-specific annealing and
initiates PCR amplification of genomic segments which are flanked by inversely orientated and
closely spaced repeat sequences. In one embodiment of the present invention, microsatellite
markers as disclosed herein, or substantially similar sequences or allelic variants thereof, may be used
to detect the appearance or disappearance of markers indicating genomic instability as described by
Leroy et al. (Electron. J Biotechnol, 3(2), at http://www.ejb.org (2000)), where alteration of a
fingerprinting pattern indicated loss of a marker corresponding to a part of a gene involved in the
regulation of cell proliferation. Microsatellite markers are useful for detecting genomic alterations
such as the change observed by Leroy et al. (Electron. J Biotechnol, 3(2), supra (2000)) which
appeared to be the consequence of microsatellite instability at the primer binding site or modification
of the region between the microsatellites, and illustrated somaclonal variation leading to genomic
instability. Consequently, sequences of the present invention are useful for detecting genomic
alterations involved in somaclonal variation, which is an important source of new phenotypes.

In addition, because the genomes of closely related species are largely syntenic (that is, they
display the same ordering of genes within the genome), these maps can be used to isolate novel
alleles from wild relatives of crop species by positional cloning strategies. This shared synteny is very
powerful for using genetic maps from one species to map genes in another. For example, a gene
mapped in rice provides information for the gene location in maize and wheat.

Example 15: Quantitative Trait Linked Breeding

Various types of maps can be used with the sequences of the invention to identify Quantitative
Trait Loci (QTLs) for a variety of uses, including marker-assisted breeding. Many important crop
traits are quantitative traits and result from the combined interactions of several genes. These genes
reside at different loci in the genome, often on different chromosomes, and generally exhibit multiple
alleles at each locus. Developing markers, tools, and methods to identify and isolate the QTLs
involved in a trait enables marker-assisted breeding to enhance desirable traits or suppress
undesirable traits. The sequences disclosed herein can be used as markers for QTLs to assist
marker-assisted breeding. The sequences of the invention can be used to identify QTLs and isolate
alleles as described by Li et al. in a study of QTLs involved in resistance to a pathogen of rice (Li
et al., Mol Gen Genet 261:58 (1999)). In addition to isolating QTL alleles in rice, other cereals,
and other monocot and dicot crop species, the sequences of the invention can also be used to isolate
alleles from the corresponding QTL(s) of wild relatives. Transgenic plants having various
combinations of QTL alleles can then be created and the effects of the combinations measured.
Once an ideal allele combination has been identified, crop improvement can be accomplished either
through biotechnological means or by directed conventional breeding programs (Flowers et al., J
Exp Bot 51:99 (2000); Tanksley and McCouch, Science 277:1063 (1997)).

Example 16: Marker-Assisted Breeding 

Markers or genes associated with specific desirable or undesirable traits are known and used
in marker assisted breeding programs. It is particularly beneficial to be able to screen large numbers
of markers and large numbers of candidate parental plants or progeny plants. The methods of the
invention allow high volume, multiplex screening for numerous markers from numerous individuals
simultaneously.







A multiplex assay is designed providing SSRs specific to each of the markers of interest. The
SSRs are linked to different classes of beads. All of the relevant markers may be expressed genes,
so RNA or cDNA techniques are appropriate. RNA is extracted from root tissue of 1000 different
individual plants and hybridized in parallel reactions with the different classes of beads. Each class of
beads is analyzed for each sample using a microfluidics analyzer. For the classes of beads
corresponding to qualitative traits, qualitative measures of presence or absence of the target gene are
recorded. For the classes of beads corresponding to quantitative traits, quantitative measures of
gene activity are recorded. Individuals showing activity of all of the qualitative genes and highest
expression levels of the quantitative traits are selected for further breeding steps. In procedures
wherein no individuals have desirable results for all the measured genes, individuals having the most
desirable, and fewest undesirable, results are selected for further breeding steps. In either case,
progeny are screened to further select for homozygotes with high quantitative levels of expression of
the quantitative traits.
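
The selection rule described above can be sketched as a small ranking routine. This is an illustrative sketch only: the record layout, the marker names, and the choice to rank quantitative markers by the simple sum of their signals are assumptions made for the example and are not specified in this disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PlantResult:
    plant_id: str
    qualitative: Dict[str, bool]    # bead class -> target gene detected?
    quantitative: Dict[str, float]  # bead class -> measured expression signal


def select_for_breeding(results: List[PlantResult], top_n: int = 10) -> List[PlantResult]:
    """Prefer plants positive for every qualitative marker, ranked by
    quantitative signal; if none qualify, fall back to the plants with the
    most qualitative markers present."""
    def n_present(r: PlantResult) -> int:
        return sum(r.qualitative.values())

    def expression(r: PlantResult) -> float:
        # Assumption: total signal across quantitative markers is the score.
        return sum(r.quantitative.values())

    complete = [r for r in results if all(r.qualitative.values())]
    pool = complete if complete else results
    ranked = sorted(pool, key=lambda r: (n_present(r), expression(r)), reverse=True)
    return ranked[:top_n]


# Hypothetical readings for three plants and made-up marker names
plants = [
    PlantResult("p1", {"q1": True, "q2": True}, {"y1": 5.0}),
    PlantResult("p2", {"q1": True, "q2": False}, {"y1": 9.0}),
    PlantResult("p3", {"q1": True, "q2": True}, {"y1": 7.0}),
]
print([p.plant_id for p in select_for_breeding(plants, top_n=1)])
```

In this example p2 has the strongest quantitative signal but lacks one qualitative marker, so p3 is selected. A weighted score per trait could replace the simple sum where some quantitative traits matter more than others.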

Example 17: Method of modifying the gene frequency

The invention further provides a method of modifying the frequency of a gene in a plant
population, including the steps of: identifying an SSR within a coding region of a gene; screening a
plurality of plants using the SSR as a marker to determine the presence or absence of the gene in an
individual plant; selecting at least one individual plant for breeding based on the presence or absence
of the gene; and breeding at least one plant thus selected to produce a population of plants having a
modified frequency of the gene. The identification of the SSR within the coding region of a gene can
be accomplished based on sequence similarity between the nucleic acid molecules of the invention
and the region within the gene of interest flanking the SSR.
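
The screening and selection steps of this method can be sketched as follows. The plant identifiers and marker calls are hypothetical, and the sketch only compares the gene frequency in the screened population with that in the selected breeding pool; the frequency in the resulting progeny depends on zygosity and mating design, which are not modelled here.

```python
from typing import Dict, List


def gene_frequency(calls: Dict[str, bool]) -> float:
    """Fraction of screened plants in which the SSR marker indicates
    that the gene is present."""
    if not calls:
        raise ValueError("no plants screened")
    return sum(calls.values()) / len(calls)


def select_parents(calls: Dict[str, bool], want_present: bool = True) -> List[str]:
    """Select plants for breeding based on presence (or absence) of the gene."""
    return [plant for plant, present in calls.items() if present == want_present]


# Hypothetical SSR marker calls for eight screened plants
screen = {
    "plant_01": True, "plant_02": False, "plant_03": False, "plant_04": True,
    "plant_05": False, "plant_06": False, "plant_07": True, "plant_08": False,
}

parents = select_parents(screen, want_present=True)
pool = {plant: screen[plant] for plant in parents}

print(gene_frequency(screen))  # frequency among all screened plants
print(gene_frequency(pool))    # frequency among selected parents
```

Passing want_present=False selects against the gene instead, which corresponds to suppressing an undesirable trait.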

Supporting TABLES

Table 12: This table illustrates the correlation between rice sequences in sub groups I and III
that show homologies between 80% and 99.9% to each other

Sub-Group II Sequences    Sub-Group I Sequences
SEQ ID NO                 SEQ ID NO
513                       121, 123
515                       333
517                       441, 443
519                       151
521                       9
523                       73
525                       203
527                       215
529                       209
531                       103
533                       407
535                       115
537                       165
539                       1
541                       325
543                       397
545                       61
547                       455
549                       255
551                       351
553                       225
555                       139
557                       25
559                       3
561                       17
563                       279
565                       191
567                       451
569                       417
571                       99, 95, 435
573                       91, 81
575                       95, 99
577                       85
579                       229, 223
581                       83
583                       401, 235
585                       283
587                       179
589                       135
591                       141
595                       5
597                       311
599                       379
601                       123, 121
603                       335
605                       287
607                       161
609                       69
611                       177
615                       413
617                       143
619                       251
621                       331
623                       375
625                       67
627                       387
629                       81, 91
631                       89
633                       181
635                       297
637                       309
639                       329
641                       229
593, 613                  221

Table 13: This table illustrates the correlation between rice sequences in sub-groups I and II

155                       507
191                       503
89                        509
187                       505
299                       501
447                       511

Table 14: Description of "Grain Filling" QTLs identified in Tables 2 and 3

QTL: OS-AE-1-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Allelopathic effect
Citation: BREEDING SCIENCE (2001) 51:47-51
Chromosome: 1
Flanking Markers(s):

QTL: OS-AE-11-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Allelopathic effect
Citation: BREEDING SCIENCE (2001) 51:47-51
Chromosome: 11
Flanking Markers(s):

QTL: OS-AE-12-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Allelopathic effect
Citation: BREEDING SCIENCE (2001) 51:47-51
Chromosome: 12
Flanking Markers(s):

QTL: OS-AE-5-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Allelopathic effect
Citation: BREEDING SCIENCE (2001) 51:47-51
Chromosome: 5
Flanking Markers(s):

QTL: OS-AMY-5-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Amylose content
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 5
Flanking Markers(s):

QTL: OS-AMY-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Amylose content
Citation: THEOR APPL GENET (1999) 99:642-648
Chromosome: 6
Flanking Markers(s):

QTL: OS-AMY-6-2
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Amylose content
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 6
Flanking Markers(s):

QTL: OS-APDF-9-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Albino plantlet differentiation frequency
Citation: MOLECULAR BREEDING (1998) 4:165-172
Chromosome: 9
Flanking Markers(s):

QTL: OS-ASS-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Alkali spreading score
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 6
Flanking Markers(s):

QTL: OS-ASS-6-2
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Alkali spreading score
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 6
Flanking Markers(s):

QTL: OS-BDV-1-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Breakdown viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 1
Flanking Markers(s):

QTL: OS-BDV-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Breakdown viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-CHALK-1-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Grain chalkiness
Citation: THEOR APPL GENET (2000) 101:823-829
Chromosome: 1
Flanking Markers(s): 0

QTL: OS-CHALK-10-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Grain chalkiness
Citation: THEOR APPL GENET (2000) 101:823-829
Chromosome: 10
Flanking Markers(s): 83.5

QTL: OS-CHALK-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Grain chalkiness
Citation: THEOR APPL GENET (2000) 101:823-829
Chromosome: 6
Flanking Markers(s): 12.5

QTL: OS-CIF-6-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Callus induction frequency
Citation: MOLECULAR BREEDING (1998) 4:165-172
Chromosome: 6
Flanking Markers(s):

QTL: OS-CPV-1-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Cool paste viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 1
Flanking Markers(s):

QTL: OS-CPV-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Cool paste viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-CPV-6-2
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Cool paste viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-CSV-1-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Consistency viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 1
Flanking Markers(s):

QTL: OS-CSV-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Consistency viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-CSV-6-2
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Consistency viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-DM-6-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Dry Mass
Citation: PLANT PHYSIOLOGY (2001) 125:406-422
Chromosome: 6
Flanking Markers(s): 16.7

QTL: OS-FLLEN-3-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Source-sink capacity
Citation: MOLECULAR BREEDING (1998) 4:419-426
Chromosome: 2
Flanking Markers(s): 160

QTL: OS-FLLEN-9-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Source-sink capacity
Citation: MOLECULAR BREEDING (1998) 4:419-426
Chromosome: 4
Flanking Markers(s):

QTL: OS-FLWID-3-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Source-sink capacity
Citation: MOLECULAR BREEDING (1998) 4:419-426
Chromosome: 8
Flanking Markers(s):

QTL: OS-GC-2-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Gel consistency
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 2
Flanking Markers(s):

QTL: OS-GC-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Gel consistency
Citation: THEOR APPL GENET (1999) 99:642-648
Chromosome: 6
Flanking Markers(s):

QTL: OS-GP-1-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per panicle
Citation: THEOR APPL GENET (2000) 101:248-254
Chromosome: 1
Flanking Markers(s):

QTL: OS-GP-6-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per panicle
Citation: THEOR APPL GENET (2000) 101:248-254
Chromosome: 6
Flanking Markers(s):

QTL: OS-GPDF-1-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Green plantlet differentiation frequency
Citation: MOLECULAR BREEDING (1998) 4:165-172
Chromosome: 1
Flanking Markers(s):

QTL: OS-GPL-1-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per plant
Citation: GENETICS (1998) 150:899-909
Chromosome: 1
Flanking Markers(s):

QTL: OS-GPL-2-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per plant
Citation: GENETICS (1998) 150:899-909
Chromosome: 2
Flanking Markers(s):

QTL: OS-GPL-4-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per plant
Citation: GENETICS (1998) 150:899-909
Chromosome: 4
Flanking Markers(s):

QTL: OS-GPL-8-2
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per plant
Citation: GENETICS (1998) 150:899-909
Chromosome: 8
Flanking Markers(s):

QTL: OS-GPP-4-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per panicle
Citation: GENETICS (1998) 150:899-909
Chromosome: 4
Flanking Markers(s):

QTL: OS-GPP-8-2
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grains per panicle
Citation: GENETICS (1998) 150:899-909
Chromosome: 8
Flanking Markers(s):

QTL: OS-GPYF-1-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Green plantlet yield frequency
Citation: MOLECULAR BREEDING (1998) 4:165-172
Chromosome: 1
Flanking Markers(s):

QTL: OS-GW-1-2
Species: Oryza sativa
General Trait: YIELD
Specific Trait: 1000 grain weight
Citation: THEOR APPL GENET (2001) 102:41-52
Chromosome: 1
Flanking Markers(s):

QTL: OS-GW-3-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight - 1000 grains
Citation: GENETICS (1998) 150:899-909
Chromosome: 3
Flanking Markers(s):

QTL: OS-GW-3-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight
Citation: THEOR APPL GENET (2000) 101:248-254
Chromosome: 3
Flanking Markers(s):

QTL: OS-GW-3-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: 1000 grain weight
Citation: THEOR APPL GENET (2001) 102:41-52
Chromosome: 3
Flanking Markers(s):

QTL: OS-GW-5-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight - 1000 grains
Citation: GENETICS (1998) 150:899-909
Chromosome: 5
Flanking Markers(s):

QTL: OS-GW-5-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight
Citation: THEOR APPL GENET (2000) 101:248-254
Chromosome: 5
Flanking Markers(s):

QTL: OS-GW-9-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight - 1000 grains
Citation: GENETICS (1998) 150:899-909
Chromosome: 9
Flanking Markers(s):

QTL: OS-GW100-4-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain weight - 100 grains
Citation: THEOR APPL GENET (1998) 96:957-963
Chromosome: 4
Flanking Markers(s): 100



QTL: OS-GYLD-1-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain yield - tons/ha
Citation: GENETICS (1998) 150:899-909
Chromosome: 1
Flanking Markers(s):

QTL: OS-GYLD-2-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain yield - tons/ha
Citation: GENETICS (1998) 150:899-909
Chromosome: 2
Flanking Markers(s):

QTL: OS-GYLD-4-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain yield - tons/ha
Citation: GENETICS (1998) 150:899-909
Chromosome: 4
Flanking Markers(s):

QTL: OS-GYLD-8-2
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Grain yield - tons/ha
Citation: GENETICS (1998) 150:899-909
Chromosome: 8
Flanking Markers(s):

QTL: OS-HPV-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Hot paste viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-HPV-6-2
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Hot paste viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-PGWC-8-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Percentage of grain with white core
Citation: THEOR APPL GENET (1999) 98:502-508
Chromosome: 8
Flanking Markers(s):

QTL: OS-REGEN-3-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Regeneration ability
Citation: THEOR APPL GENET (1999) 98:243-251
Chromosome: 3
Flanking Markers(s): 9

QTL: OS-RGT-2-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Reproductive growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 2
Flanking Markers(s):

QTL: OS-RGT-5-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Reproductive growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 5
Flanking Markers(s):

QTL: OS-SBV-1-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Setback viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 1
Flanking Markers(s):

QTL: OS-SBV-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Setback viscosity
Citation: THEOR APPL GENET (2000) 100:280-284
Chromosome: 6
Flanking Markers(s):

QTL: OS-VGT-2-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Vegetative growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 2
Flanking Markers(s):

QTL: OS-VGT-2-2
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Vegetative growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 2
Flanking Markers(s):

QTL: OS-VGT-5-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Vegetative growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 5
Flanking Markers(s):

QTL: OS-VGT-9-1
Species: Oryza sativa
General Trait: DEVELOPMENT
Specific Trait: Vegetative growth time
Citation: THEOR APPL GENET (2001) 102:1236-1242
Chromosome: 9
Flanking Markers(s):

QTL: OS-WC-6-1
Species: Oryza sativa
General Trait: QUALITY
Specific Trait: Grain white core
Citation: THEOR APPL GENET (2000) 101:823-829
Chromosome: 6
Flanking Markers(s): 13.5

QTL: OS-Y-6-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Yield
Citation: THEOR APPL GENET (2000) 101:248-254
Chromosome: 6
Flanking Markers(s):

QTL: OS-YLD-1-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Yield
Citation: THEOR APPL GENET (2001) 102:41-52
Chromosome: 1
Flanking Markers(s):

QTL: OS-YLD-5-1
Species: Oryza sativa
General Trait: YIELD
Specific Trait: Yield
Citation: THEOR APPL GENET (2001) 102:793-800
Chromosome: 5
Flanking Markers(s):

QTL: ZM-BIOM-3-1
Species: Zea mays
General Trait: YIELD
Specific Trait: "Biomass, above ground"
Citation: THEOR APPL GENET (1999) 99:1106-1119
Chromosome: 3
Flanking Markers(s): "UMC3,UMC96"

QTL: ZM-BIOM-5-1
Species: Zea mays
General Trait: YIELD
Specific Trait: "Biomass, above ground"
Citation: THEOR APPL GENET (1999) 99:1106-1119
Chromosome: 5
Flanking Markers(s): UMC166

QTL: ZM-BIOM-7-1
Species: Zea mays
General Trait: YIELD
Specific Trait: "Biomass, above ground"
Citation: THEOR APPL GENET (1999) 99:1106-1119
Chromosome: 7
Flanking Markers(s): "UMC116,BNL14.07"

QTL: ZM-BIOM-8-1
Species: Zea mays
General Trait: YIELD
Specific Trait: "Biomass, above ground"
Citation: THEOR APPL GENET (1999) 99:1106-1119
Chromosome: 8
Flanking Markers(s): "UMC138L,UMC12"

QTL: ZM-CL-9-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Cellulose content
Citation: THEOR APPL GENET (2001) 102:591-599
Chromosome: 9
Flanking Markers(s):

QTL: ZM-CPC-1-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC76

QTL: ZM-CPC-1-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein content
Citation: CROP SCI (2001) 41:690-697
Chromosome: 1
Flanking Markers(s): 224

QTL: ZM-CPC-1-3
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC58

QTL: ZM-CPC-1-4
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC128

QTL: ZM-CPC-1-5
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC67

QTL: ZM-CPC-1-6
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC83

QTL: ZM-CPC-10-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 10
Flanking Markers(s): UMC130

QTL: ZM-CPC-3-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 3
Flanking Markers(s): UMC154

QTL: ZM-CPC-3-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 3
Flanking Markers(s): BNL1.297

QTL: ZM-CPC-3-3
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 3
Flanking Markers(s): UMC10

QTL: ZM-CPC-5-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 5
Flanking Markers(s): BNL6.22

QTL: ZM-CPC-6-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 6
Flanking Markers(s): UMC85

QTL: ZM-CPC-7-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 7
Flanking Markers(s): UMC98B

QTL: ZM-CPC-7-3
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 7
Flanking Markers(s): UMC56

QTL: ZM-CPC-8-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Crude protein concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 8
Flanking Markers(s): UMC71

QTL: ZM-DMC-1-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Dry matter concentration
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC33

20 QTL: ZM-DMC-1-1 

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: THEOR APPL GENET (2001) 
25 102:230-243 

Chromosome: 1 

Flanking Markers(s): 

QTL: ZM-DMC-1-2 
30 Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 1 
35 Flanking Markers(s): UMC 1 28 

QTL: ZM-DMC- 10-1 
Species: Zea mays 
General Trait: YIELD 
40 Specific Trait: Dry matter concentration 
Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 10 
Flanking Markers(s): UMC146 



45 QTL: ZM-DMC- 1 0- 1 

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: THEOR APPL GENET (2001) 
50 102:230-243 

Chromosome: 10 

Flanking Markers(s): 

QTL: ZM-DMC- 10-2 
55 Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 10 
60 Flanking Markers(s): UMC 1 46 

QTL: ZM-DMC-2-3 
Species: Zea mays 
General Trait: YIELD
Specific Trait: Dry matter concentration
Citation: THEOR APPL GENET (2001) 

102:230-243 
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-DMC-5-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter concentration 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 5 
Flanking Markers(s): UMC68 

QTL: ZM-DMC-5-1 

Species: Zea mays
General Trait: YIELD 
Specific Trait: Dry matter content 
Citation: CROP SCI (2001) 41:690-697 
Chromosome: 5 

Flanking Markers(s): 116

QTL: ZM-DMC-5-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Dry matter concentration
Citation: THEOR APPL GENET (2001)
102:230-243
Chromosome: 5
Flanking Markers(s):

QTL: ZM-DMC-6-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 6

Flanking Markers(s): UMC85 

QTL: ZM-DMC-6-1 

Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter concentration 

Citation: THEOR APPL GENET (2001) 
102:230-243 

Chromosome: 6 
Flanking Markers(s):

QTL: ZM-DMC-6-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter concentration

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 6 

Flanking Markers(s): UMC59 

QTL: ZM-DMC-8-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 8

Flanking Markers(s): UMC117

QTL: ZM-DMC-8-1



Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter content 
Citation: CROP SCI (2001) 41:690-697 
Chromosome: 8 
Flanking Markers(s): 132 

QTL: ZM-DMC-8-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Dry matter concentration
Citation: THEOR APPL GENET (2001)
102:230-243
Chromosome: 8
Flanking Markers(s):

QTL: ZM-DMC-8-2

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter concentration 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 8

Flanking Markers(s): UMC71 

QTL: ZM-DMC-8-2 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter content 
Citation: CROP SCI (2001) 41:690-697 
Chromosome: 8 
Flanking Markers(s): 176 

QTL: ZM-DMY-1-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1 

Flanking Markers(s): UMC167 

QTL: ZM-DMY-1-3
Species: Zea mays
General Trait: YIELD 



Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 1 

Flanking Markers(s): UMC83A 

QTL: ZM-DMY-1-4 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1 
Flanking Markers(s): BNL5.59 

QTL: ZM-DMY-1-5 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 1 
Flanking Markers(s): UMC83

QTL: ZM-DMY-10-1
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 10 

Flanking Markers(s): UMC64 

QTL: ZM-DMY-10-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (2001) 41:690-697 
Chromosome: 10

Flanking Markers(s): 56 

QTL: ZM-DMY-2-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 2 



Flanking Markers(s): UMC53 

QTL: ZM-DMY-2-3 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 2 
Flanking Markers(s): UMC4 

QTL: ZM-DMY-2-4 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 2 
Flanking Markers(s): UMC36

QTL: ZM-DMY-3-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 3 

Flanking Markers(s): BNL6.16 

QTL: ZM-DMY-3-2

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 3

Flanking Markers(s): UMC154 

QTL: ZM-DMY-3-3 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter yield

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 3 

Flanking Markers(s): UMC10 

QTL: ZM-DMY-4-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Dry matter yield
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 4

Flanking Markers(s): UMC31 

QTL: ZM-DMY-4-2 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 4 

Flanking Markers(s): BNL7.65 

QTL: ZM-DMY-4-3 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 4 
Flanking Markers(s): UMC42 

QTL: ZM-DMY-4-4 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 4 
Flanking Markers(s): UMC127B

QTL: ZM-DMY-5-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 5 

Flanking Markers(s): BNL7.71 

QTL: ZM-DMY-8-1
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 



Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 8

Flanking Markers(s): UMC120 

QTL: ZM-DMY-8-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Dry matter yield
Citation: CROP SCI (2001) 41:690-697 
Chromosome: 8 
Flanking Markers(s): 172 

QTL: ZM-DMY-8-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Dry matter yield 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 8 

Flanking Markers(s): UMC12A 

QTL: ZM-DMY-9-1 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Dry matter yield 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 9 
Flanking Markers(s): UMC95

QTL: ZM-EWT-2-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Ear weight
Citation: THEOR APPL GENET (1999)
99:280-288
Chromosome: 2 
Flanking Markers(s): PHI083 

QTL: ZM-EWT-4-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Ear weight 
Citation: THEOR APPL GENET (1999)
99:280-288
Chromosome: 4
Flanking Markers(s): PHI093

QTL: ZM-GWE-9-1 
Species: Zea mays
General Trait: YIELD 
Specific Trait: Grain weight per ear 
Citation: THEOR APPL GENET (2001) 
102:591-599 
Chromosome: 9

Flanking Markers(s): 

QTL: ZM-GWM2-1-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: "Yield, grain weight per square 
meter" 

Citation: THEOR APPL GENET (1999) 
99:1106-1119 
Chromosome: 1

Flanking Markers(s): "UMC163,UMC161"

QTL: ZM-GWM2-10-1
Species: Zea mays
General Trait: YIELD

Specific Trait: "Yield, grain weight per square 
meter" 

Citation: THEOR APPL GENET (1999) 
99:1106-1119 
Chromosome: 10

Flanking Markers(s): "UMC146,UMC44" 

QTL: ZM-GWM2-3-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: "Yield, grain weight per square 
meter" 

Citation: THEOR APPL GENET (1999) 
99:1106-1119 
Chromosome: 3

Flanking Markers(s): "UMC92,UMC10"

QTL: ZM-GWM2-3-2 



Species: Zea mays 
General Trait: YIELD

Specific Trait: "Yield, grain weight per square 
meter" 

Citation: THEOR APPL GENET (1999) 
99:1106-1119 
Chromosome: 3

Flanking Markers(s): "UMC3,UMC96" 

QTL: ZM-GWM2-7-1
Species: Zea mays
General Trait: YIELD

Specific Trait: "Yield, grain weight per square 
meter" 

Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 7

Flanking Markers(s): "BNL15.40,UMC116" 

QTL: ZM-GYHA-1-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Grain yield per hectare 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-GYHA-1-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield per hectare 
Citation: CROP SCI (1998) 38:1296-1308
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-GYHA-1-3 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield per hectare 

Citation: CROP SCI (1998) 38:1296-1308 

Chromosome: 1 
Flanking Markers(s):

QTL: ZM-GYHA-1-4
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield per hectare 
Citation: CROP SCI (1998) 38:1296-1308
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-GYHA-3-1 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield per hectare 

Citation: CROP SCI (1998) 38:1296-1308 

Chromosome: 3 
Flanking Markers(s):

QTL: ZM-GYHA-5-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield per hectare

Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 5 
Flanking Markers(s): 

QTL: ZM-GYHA-6-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield per hectare 

Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 6

Flanking Markers(s): 

QTL: ZM-GYHA-8-1 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Grain yield per hectare 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 8 
Flanking Markers(s): 

QTL: ZM-GYLD-1-1 
Species: Zea mays 
General Trait: YIELD 



Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-GYLD-1-2 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 

Chromosome: 1 
Flanking Markers(s):

QTL: ZM-GYLD-2-1 

Species: Zea mays 

General Trait: YIELD 
Specific Trait: Grain yield

Citation: PLANT BREEDING (1998) 
117:193-202 

Chromosome: 2 

Flanking Markers(s): "CDOCMT202,CSU75C"

QTL: ZM-GYLD-2-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-GYLD-2-3

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 2

Flanking Markers(s): 

QTL: ZM-GYLD-2-4 
Species: Zea mays 
General Trait: YIELD
Specific Trait: Grain yield
Citation: CROP SCI (2000) 40:30-39
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-GYLD-3-3

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 3

Flanking Markers(s): 

QTL: ZM-GYLD-4-1 
Species: Zea mays 
General Trait: YIELD
Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 4 
Flanking Markers(s): 

QTL: ZM-GYLD-5-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 5
Flanking Markers(s):

QTL: ZM-GYLD-5-2 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 

Chromosome: 5 
Flanking Markers(s):

QTL: ZM-GYLD-5-3 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 5 
Flanking Markers(s): 



QTL: ZM-GYLD-6-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: PLANT BREEDING (1998) 
117:193-202

Chromosome: 6 

Flanking Markers(s): "CSU70,CDO580B" 

QTL: ZM-GYLD-6-2 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 

Chromosome: 6 
Flanking Markers(s):

QTL: ZM-GYLD-6-3 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 6 
Flanking Markers(s): 

QTL: ZM-GYLD-6-4

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 6

Flanking Markers(s): 

QTL: ZM-GYLD-7-3 
Species: Zea mays 
General Trait: YIELD
Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 7 
Flanking Markers(s): 

QTL: ZM-GYLD-8-2
Species: Zea mays
General Trait: YIELD
Specific Trait: Grain yield
Citation: CROP SCI (2000) 40:30-39
Chromosome: 8
Flanking Markers(s): 

QTL: ZM-GYLD-9-1 
Species: Zea mays 
General Trait: YIELD
Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-GYLD-9-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-GYUI-9-1 
Species: Zea mays
General Trait: YIELD
Specific Trait: Yield under corn borer
infestation

Citation: THEOR APPL GENET (2000)
101:907-917
Chromosome: 9
Flanking Markers(s):

QTL: ZM-GYUI-9-2 
Species: Zea mays
General Trait: YIELD
Specific Trait: Yield under corn borer
infestation

Citation: THEOR APPL GENET (2000)
101:907-917
Chromosome: 9
Flanking Markers(s):

QTL: ZM-GYUP-1-1
Species: Zea mays

General Trait: YIELD 

Specific Trait: Yield under corn borer
protection

Citation: THEOR APPL GENET (2000)
101:907-917

Chromosome: 1 

Flanking Markers(s): 

QTL: ZM-GYUP-1-2 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Yield under corn borer
protection

Citation: THEOR APPL GENET (2000)
101:907-917

Chromosome: 1 

Flanking Markers(s): 

QTL: ZM-GYUP-9-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Yield under corn borer
protection

Citation: THEOR APPL GENET (2000)
101:907-917
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-GYUP-9-2 
Species: Zea mays
General Trait: YIELD
Specific Trait: Yield under corn borer
protection

Citation: THEOR APPL GENET (2000)
101:907-917
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-HI-1-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 1
Flanking Markers(s): "UMC94,UMC76"

QTL: ZM-HI-1-2
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 1
Flanking Markers(s): "UMC163,UMC161"

QTL: ZM-HI-10-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 10
Flanking Markers(s): "UMC146,UMC44"

QTL: ZM-HI-3-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 3
Flanking Markers(s): "UMC92,UMC10"

QTL: ZM-HI-4-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 4
Flanking Markers(s): "UMC28.1,UMC19"

QTL: ZM-HI-7-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Harvest index
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 7
Flanking Markers(s): "BNL15.40,UMC116"

QTL: ZM-HI-8-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Harvest index 
Citation: THEOR APPL GENET (1999) 

99:1106-1119 
Chromosome: 8 

Flanking Markers(s): "UMC138,UMC12"

QTL: ZM-ID-10-1
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestibility of organic 
stover 

Citation: THEOR APPL GENET (2000) 
101:907-917
Chromosome: 10 
Flanking Markers(s): 

QTL: ZM-ID-2-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestibility of organic 
stover 

Citation: THEOR APPL GENET (2000) 

101:907-917 
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-ID-5-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestibility of organic 
stover

Citation: THEOR APPL GENET (2000)
101:907-917
Chromosome: 5
Flanking Markers(s):

QTL: ZM-ID-5-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: In vitro digestibility of organic
stover

Citation: THEOR APPL GENET (2000) 

101:907-917
Chromosome: 5 
Flanking Markers(s): 

QTL: ZM-ID-8-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: In vitro digestibility of organic
stover

Citation: THEOR APPL GENET (2000) 

101:907-917
Chromosome: 8 
Flanking Markers(s): 

QTL: ZM-IVDOM-1-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: In vitro digestible organic
matter

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 1
Flanking Markers(s): UMC76 

QTL: ZM-IVDOM-1-2
Species: Zea mays
General Trait: QUALITY
Specific Trait: In vitro digestible organic
matter

Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 1
Flanking Markers(s): UMC58



QTL: ZM-IVDOM-1-3 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 1 

Flanking Markers(s): UMC167 

QTL: ZM-IVDOM-1-4
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 1 

Flanking Markers(s): UMC37 

QTL: ZM-IVDOM-10-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 10 

Flanking Markers(s): UMC130 

QTL: ZM-IVDOM-10-2
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 10 

Flanking Markers(s): UMC18 

QTL: ZM-IVDOM-3-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 3
Flanking Markers(s): UMC97

QTL: ZM-IVDOM-3-3
Species: Zea mays
General Trait: QUALITY

Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 3 
Flanking Markers(s): UMC97

QTL: ZM-IVDOM-5-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic
matter 

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 5 

Flanking Markers(s): UMC43 

QTL: ZM-IVDOM-5-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic
matter

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 5 

Flanking Markers(s): BNL7.71 

QTL: ZM-IVDOM-5-3
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 5 
Flanking Markers(s): UMC90 

QTL: ZM-IVDOM-9-1 
Species: Zea mays

General Trait: QUALITY 
Specific Trait: In vitro digestible organic 
matter 



Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 9

Flanking Markers(s): BNL5.09 

QTL: ZM-IVDOM-9-2 
Species: Zea mays 
General Trait: QUALITY

Specific Trait: In vitro digestible organic 
matter 

Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 9 
Flanking Markers(s): BNL14.28

QTL: ZM-KNE-4-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Kernel number per ear
Citation: THEOR APPL GENET (1999) 

99:280-288 
Chromosome: 4 
Flanking Markers(s): PHI093 

QTL: ZM-KW100-1-2 
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 100 kernels 
Citation: THEOR APPL GENET (1999)
99:1106-1119 
Chromosome: 1 

Flanking Markers(s): "UMC157,BNL8.29"

QTL: ZM-KW100-3-1
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 100 kernels 
Citation: THEOR APPL GENET (1999) 
99:1106-1119
Chromosome: 3 
Flanking Markers(s): UMC60 

QTL: ZM-KW100-9-1
Species: Zea mays
General Trait: YIELD
Specific Trait: Kernel weight per 100 kernels
Citation: THEOR APPL GENET (1999)
99:1106-1119
Chromosome: 9
Flanking Markers(s): "UMC153,BNL5.09"

QTL: ZM-KW300-1-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Kernel weight per 300 kernels
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-KW300-3-2
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 3

Flanking Markers(s): 

QTL: ZM-KW300-3-3 
Species: Zea mays 
General Trait: YIELD

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 3 
Flanking Markers(s): 

QTL: ZM-KW300-4-2 
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308
Chromosome: 4 
Flanking Markers(s): 

QTL: ZM-KW300-5-1 
Species: Zea mays
General Trait: YIELD 

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308 



Chromosome: 5 
Flanking Markers(s):

QTL: ZM-KW300-6-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Kernel weight per 300 kernels
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 6 
Flanking Markers(s): 

QTL: ZM-KW300-8-2
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 8

Flanking Markers(s): 

QTL: ZM-KW300-9-1
Species: Zea mays
General Trait: YIELD

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308 
Chromosome: 9 
Flanking Markers(s):

QTL: ZM-KW300-9-2 
Species: Zea mays 
General Trait: YIELD 

Specific Trait: Kernel weight per 300 kernels 
Citation: CROP SCI (1998) 38:1296-1308
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-KWE-4-1 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Kernel weight per ear 

Citation: THEOR APPL GENET (1999) 
99:280-288 
Chromosome: 4

Flanking Markers(s): PHI093

QTL: ZM-MOIST-1-1
Species: Zea mays
General Trait: QUALITY
Specific Trait: Grain moisture
Citation: CROP SCI (2000) 40:30-39
Chromosome: 1
Flanking Markers(s): 

QTL: ZM-MOIST-1-2
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 1
Flanking Markers(s):

QTL: ZM-MOIST-1-3 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture
Citation: CROP SCI (2000) 40:30-39
Chromosome: 1
Flanking Markers(s):

QTL: ZM-MOIST-1-4 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-MOIST-1-5 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 1 

Flanking Markers(s):

QTL: ZM-MOIST-1-6 
Species: Zea mays 



General Trait: QUALITY 
Specific Trait: Grain moisture

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-MOIST-10-1
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 

Chromosome: 10
Flanking Markers(s): 

QTL: ZM-MOIST-2-1 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-MOIST-2-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-MOIST-2-3 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 2 

Flanking Markers(s):

QTL: ZM-MOIST-3-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture
Citation: CROP SCI (2000) 40:30-39
Chromosome: 3
Flanking Markers(s): 

QTL: ZM-MOIST-3-3 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 3
Flanking Markers(s):

QTL: ZM-MOIST-4-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 4 
Flanking Markers(s): 

QTL: ZM-MOIST-4-3
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 

Chromosome: 4

Flanking Markers(s): 

QTL: ZM-MOIST-4-4 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 4 
Flanking Markers(s): 

QTL: ZM-MOIST-5-1
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 5 
Flanking Markers(s): 



QTL: ZM-MOIST-5-2 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 5
Flanking Markers(s):

QTL: ZM-MOIST-5-3 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 5 
Flanking Markers(s): 

QTL: ZM-MOIST-5-4
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 

Chromosome: 5

Flanking Markers(s): 

QTL: ZM-MOIST-6-2 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 6 
Flanking Markers(s): 

QTL: ZM-MOIST-7-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 7 
Flanking Markers(s): 

QTL: ZM-MOIST-7-2 
Species: Zea mays
General Trait: QUALITY
Specific Trait: Grain moisture
Citation: CROP SCI (2000) 40:30-39
Chromosome: 7 
Flanking Markers(s): 

QTL: ZM-MOIST-7-3 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture
Citation: CROP SCI (2000) 40:30-39
Chromosome: 7 
Flanking Markers(s): 

QTL: ZM-MOIST-7-4 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 7 

Flanking Markers(s):

QTL: ZM-MOIST-8-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture

Citation: CROP SCI (2000) 40:30-39 
Chromosome: 8 
Flanking Markers(s): 

QTL: ZM-MOIST-8-2
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 

Chromosome: 8

Flanking Markers(s): 

QTL: ZM-MOIST-9-2 
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39 
Chromosome: 9 



Flanking Markers(s): 

QTL: ZM-MOIST-9-3 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Grain moisture 
Citation: CROP SCI (2000) 40:30-39
Chromosome: 9 
Flanking Markers(s): 

QTL: ZM-PC-1-1 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Protein concentration 
Citation: CROP SCI (1998) 38:1062-1072 
Chromosome: 1 

Flanking Markers(s): "CSU92,CSUCMT11B"

QTL: ZM-PC-1-2 
Species: Zea mays 
General Trait: QUALITY

Specific Trait: Protein concentration 
Citation: CROP SCI (1998) 38:1062-1072 
Chromosome: 1 

Flanking Markers(s): "BNL8.29A,BNL6.32"

QTL: ZM-PC-5-1
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Protein concentration 
Citation: CROP SCI (1998) 38:1062-1072
Chromosome: 5 

Flanking Markers(s): "UMC51A,UMC127B"

QTL: ZM-PC-8-1
Species: Zea mays

General Trait: QUALITY 
Specific Trait: Protein concentration 
Citation: CROP SCI (1998) 38:1062-1072 
Chromosome: 8 

Flanking Markers(s): "CSU75D,CDO580A"

QTL: ZM-PC-9-1
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Protein concentration 
Citation: CROP SCI (1998) 38:1062-1072
Chromosome: 9
Flanking Markers(s): "CSU158,CSU147"

QTL: ZM-PR-9-1 
Species: Zea mays

General Trait: QUALITY 

Specific Trait: Protein content 

Citation: THEOR APPL GENET (2001) 
102:591-599 
Chromosome: 9
Flanking Markers(s):

QTL: ZM-STC-10-1 
Species: Zea mays 
General Trait: QUALITY

Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 10 
Flanking Markers(s): UMC146 

QTL: ZM-STC-10-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 10 
Flanking Markers(s): UMC18 

QTL: ZM-STC-2-2 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 2 

Flanking Markers(s): UMC36

QTL: ZM-STC-5-1 
Species: Zea mays 



General Trait: QUALITY 
Specific Trait: Starch concentration

Citation: CROP SCI (1998) 38:1278-1289 

Chromosome: 5 

Flanking Markers(s): BNL5.40 

QTL: ZM-STC-5-1

Species: Zea mays 

General Trait: QUALITY 

Specific Trait: Starch content 

Citation: CROP SCI (2001) 41:690-697 
Chromosome: 5

Flanking Markers(s): 60 

QTL: ZM-STC-6-1 
Species: Zea mays 
General Trait: QUALITY

Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 6 
Flanking Markers(s): UMC46 

QTL: ZM-STC-7-2 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289
Chromosome: 7
Flanking Markers(s): UMC110

QTL: ZM-STC-8-1 

Species: Zea mays

General Trait: QUALITY 
Specific Trait: Starch concentration 
Citation: CROP SCI (1998) 38:1278-1289 
Chromosome: 8 

Flanking Markers(s): UMC124

QTL: ZM-STC-8-1 
Species: Zea mays 
General Trait: QUALITY 
Specific Trait: Starch content
Citation: CROP SCI (2001) 41:690-697
Chromosome: 8
Flanking Markers(s): 54

QTL: ZM-TGW-4-1
Species: Zea mays

General Trait: YIELD 

Specific Trait: Thousand grain weight 

Citation: THEOR APPL GENET (2001) 
102:591-599 
Chromosome: 4

Flanking Markers(s): 

QTL: ZM-TGW-9-1 

Species: Zea mays 
General Trait: YIELD

Specific Trait: Thousand grain weight 

Citation: THEOR APPL GENET (2001) 
102:591-599 

Chromosome: 9 
Flanking Markers(s):

QTL: ZM-TGW-9-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Thousand grain weight

Citation: THEOR APPL GENET (2001) 

102:591-599 
Chromosome: 9 
Flanking Markers(s):

QTL: ZM-TW-1-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Test weight 
Citation: THEOR APPL GENET (2001)
102:230-243 
Chromosome: 1 
Flanking Markers(s): 

QTL: ZM-TW-10-2
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Test weight 



Citation: THEOR APPL GENET (2001) 
102:230-243
Chromosome: 10 
Flanking Markers(s): 

QTL: ZM-TW-2-3 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Test weight 

Citation: THEOR APPL GENET (2001) 
102:230-243 
Chromosome: 2

Flanking Markers(s): 

QTL: ZM-TW-5-1

Species: Zea mays 
General Trait: YIELD

Specific Trait: Test weight 

Citation: THEOR APPL GENET (2001) 
102:230-243 

Chromosome: 5 
Flanking Markers(s):

QTL: ZM-TW-8-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Test weight

Citation: THEOR APPL GENET (2001) 

102:230-243 
Chromosome: 8 
Flanking Markers(s): 

QTL: ZM-TW-9-1 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Test weight 
Citation: THEOR APPL GENET (2001)
102:230-243 
Chromosome: 9 
Flanking Markers(s):

QTL: ZM-VT-6-1
Species: Zea mays 
General Trait: QUALITY
Specific Trait: Vitreousness 
Citation: THEOR APPL GENET (2001) 
102:591-599 
Chromosome: 6
Flanking Markers(s):

QTL: ZM-YLD-1-1 

Species: Zea mays 
General Trait: YIELD

Specific Trait: Grain yield 

Citation: THEOR APPL GENET (2001) 
102:230-243 

Chromosome: 1 
Flanking Markers(s):

QTL: ZM-YLD-2-1
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield

Citation: THEOR APPL GENET (2001) 

102:230-243 
Chromosome: 2 
Flanking Markers(s): 


QTL: ZM-YLD-2-2 
Species: Zea mays 
General Trait: YIELD 
Specific Trait: Grain yield 
Citation: THEOR APPL GENET (2001)
102:230-243 



Chromosome: 2 
Flanking Markers(s): 

QTL: ZM-YLD-4-1

Species: Zea mays 

General Trait: YIELD 

Specific Trait: Grain yield 

Citation: THEOR APPL GENET (2001) 
102:230-243

Chromosome: 4 

Flanking Markers(s): 

QTL: ZM-YLD-6-1 
Species: Zea mays

General Trait: YIELD 

Specific Trait: Grain yield

Citation: THEOR APPL GENET (2001) 
102:230-243 
Chromosome: 6

Flanking Markers(s): 

QTL: ZM-YLD-9-1 

Species: Zea mays 
General Trait: YIELD

Specific Trait: Grain yield 

Citation: THEOR APPL GENET (2001) 
102:230-243 

Chromosome: 9 
Flanking Markers(s):



Table 15: Swiss-Prot Data 



101 




Accession: 
P10538


Swissprot_id:
AMYB_SOYBN


Gi_number:
231541 


Description: BETA- 
AMYLASE (1,4- 
ALPHA-D-GLUCAN 
MALTOHYDROLAS 
E) 


113 




Accession: 
Q9F234 


Swissprot_id:
AGL2_BACTQ


Gi_number:
14423647 


Description: Alpha- 
glucosidase II 


1 




Accession:
P39210


Swissprot_id:
MPV1_HUMAN


Gi_number:
730059


Description: MPV17
protein


317 




Accession: 
Q08759 


Swissprot_id: 
MYB_XENLA 


Gi_number:
730090 


Description: Myb 
protein 


329 




Accession: 
P25822 


Swissprot_id:
PUM_DROME


Gi_number:
131605 


Description: 
MATERNAL 
PUMILIO PROTEIN 


173 




Accession: 
P42862 


Swissprot_id:
G6PA_ORYSA


Gi_number:
1169797 


Description: Glucose- 
6-phosphate 
isomerase, cytosolic A 
(GPI-A) 
(Phosphoglucose 
isomerase A) (PGI-A) 
(Phosphohexose 
isomerase A) (PHI-A) 


333 




Accession: 
P02582 


Swissprot_id:
ACT1_MAIZE


Gi_number:
113220 


Description: ACTIN 1 


233 




Accession: 
P28968 


Swissprot_id: 
VGLX_HSVEB 


Gi_number:
138350 


Description: 
GLYCOPROTEIN X 
PRECURSOR 


335 




Accession: 
Q05201 


Swissprot_id:
EYA_DROME


Gi_number:
544271 


Description: 
DEVELOPMENTAL 
PROTEIN EYES 
ABSENT (PROTEIN 
CLIFT) 


119 




Accession: 
O24301


Swissprot_id:
SUS2_PEA


Gi_number:
3915037 


Description: Sucrose 
synthase 2 (Sucrose-
UDP 

glucosyltransferase 2) 


311 




Accession: 
P10290


Swissprot_id:
MYBC_MAIZE


Gi_number:
127585 


Description: 
Anthocyanin regulatory 
C1 protein


149 




Accession: 
P17784


Swissprot_id:
ALF_ORYSA


Gi_number:
113622 


Description: 

FRUCTOSE- 

BISPHOSPHATE 

ALDOLASE, 

CYTOPLASMIC 

ISOZYME 


155 




Accession: 
Q40677 


Swissprot_id:
ALFC_ORYSA


Gi_number:
3913018 


Description: 

FRUCTOSE- 

BISPHOSPHATE 

ALDOLASE, 

CHLOROPLAST 

PRECURSOR 
(ALDP)


143 




Accession: 
P46225 


Swissprot_id:
TPIC_SECCE


Gi_number:
1174745 


Description: 
Triosephosphate 
isomerase, chloroplast 
precursor (TIM) 


307 




Accession: 
P42777 


Swissprot_id:
GBF4_ARATH


Gi_number:
1169863 


Description: G-box 
binding factor 4 


341 




Accession: 
P16356


Swissprot_id:
RPB1_CAEEL


Gi_number:
133322 


Description: DNA- 
DIRECTED RNA 
POLYMERASE II 
LARGEST SUBUNIT 


193 




Accession: 
P12624


Swissprot_id:
MACS_BOVIN


Gi_number:
585447 


Description: 

MYRISTOYLATED 

ALANINE-RICH C-

KINASE

SUBSTRATE

(MARCKS) 
(ACAMP-81) 


131 




Accession: 
Q43846 


Swissprot_id:
UGS4_SOLTU


Gi_number:
2833389 


Description: Soluble 
glycogen [starch] 
synthase, chloroplast 
precursor (SS 
III) 


199 




Accession: 
P08640 


Swissprot_id:
AMYH_YEAST


Gi_number:
728850 


Description: 
GLUCOAMYLASE 

S1/S2 PRECURSOR

(GLUCAN 1,4-
ALPHA-

GLUCOSIDASE)
(1,4-ALPHA-D-
GLUCAN 

GLUCOHYDROLAS 
E) 


343 




Accession: 
P28284 


Swissprot_id:
ICP0_HSV2H


Gi_number:
124135 


Description: Trans- 
acting transcriptional 
protein ICP0
(VMW118 protein)


287 




Accession: 
O59800


Swissprot_id:
CWF5_SCHPO


Gi_number:
18202094 


Description: Cell cycle 
control protein cwf5 


191 




Accession: 
Q9ZT66 


Swissprot_id:
E134_MAIZE


Gi_number:
8928122 


Description: Endo- 
1,3;1,4-beta-D-
glucanase precursor 


215 




Accession:
P07730


Swissprot_id:
GLU2_ORYSA


Gi_number:
121475


Description:
GLUTELIN TYPE II
PRECURSOR


23 




Accession: 
O43791


Swissprot_id:
SPOP_HUMAN


Gi_number:
8134708 


Description: Speckle- 
type POZ protein 


147 




Accession: 
P48494 


Swissprot_id:
TPIS_ORYSA


Gi_number:
1351270 


Description: 
Triosephosphate 
isomerase, cytosolic 
(TIM) 


347 




Accession: 
P37829 


Swissprot_id:
SCRK_SOLTU


Gi_number:
585973


Description:
FRUCTOKINASE


157 




Accession: 
P32662 


Swissprot_id:
GPH_ECOLI


Gi_number:
418445 


Description: 
Phosphoglycolate 
phosphatase (PGP) 


349 




Accession: 
Q02910 


Swissprot_id:
CPN_DROME


Gi_number:
416833 


Description: 
CALPHOTIN 


139 




Accession: 
P12299


Swissprot_id:
GLG2_WHEAT


Gi_number:
1707930 


Description: Glucose- 
1-phosphate
adenylyltransferase
large subunit,
chloroplast precursor
(ADP-glucose
synthase) (ADP-
glucose

pyrophosphorylase)
(AGPASE S) (Alpha-
D-glucose-1-
phosphate adenyl 
transferase) 


175 




Accession: 
P52178 


Swissprot_id:
CPT2_BRAOL


Gi_number:
1706110 


Description: Triose 
phosphate/phosphate 
translocator, non-green

plastid, 
chloroplast precursor 
(CTPT) 


5 




Accession: 
P00434 


Swissprot_id:
PERX_BRARA


Gi_number:
464365 


Description: 
Peroxidase P7 


351 




Accession: 
P38682 


Swissprot_id:
GLO3_YEAST


Gi_number:
729595


Description: ZINC
FINGER PROTEIN
GLO3


353 




Accession: 
P37829 


Swissprot_id:
SCRK_SOLTU


Gi_number:
585973 


Description: 
FRUCTOKINASE 


255 




Accession: 
Q02817 


Swissprot_id: 
MUC2_HUMAN


Gi_number:
2506877 


Description: MUCIN 2 
PRECURSOR 
(INTESTINAL
MUCIN 2)


75 




Accession: 
P07206 


Swissprot_id: 
PULA_KLEPN 


Gi_number:
131589 


Description: Pullulanase
precursor (Alpha-
dextrin endo-
1,6-alpha-glucosidase)
(Pullulan 6-
glucanohydrolase)


jj / 




Accession:
P33479


Swissprot_id:
IE18_PRVKA


Gi_number:
462387


Description:
IMMEDIATE- 
EARLY PROTEIN 
IE 180 


359 




Accession: 
P08547 


Swissprot_id: 
LIN1_HUMAN 


Gi_number:
126295 


Description: LINE-1 
REVERSE 
TRANSCRIPTASE 
HOMOLOG 


361 




Accession: 
P03211 


Swissprot_id: 
EBN1_EBV


Gi_number:
119110 


Description: EBNA-1 

NUCLEAR 

PROTEIN 






Accession:
Q02817


Swissprot_id:
MUC2_HUMAN


Gi_number:
2506877


Description: MUCIN 2
PRECURSOR
(INTESTINAL
MUCIN 2)


365 




Accession: 
P08548 


Swissprot_id: 
LIN1_NYCCO


Gi_number:
126296 


Description: LINE-1 
REVERSE 
TRANSCRIPTASE 
HOMOLOG 


181 




Accession: 
Q41140


Swissprot_id:
PFPA_RICCO


Gi_number:
2499488 


Description:
PYROPHOSPHATE--FRUCTOSE 6-
PHOSPHATE 1-
PHOSPHOTRANSFERASE
ALPHA SUBUNIT
(PFP) (6-
PHOSPHOFRUCTOKINASE
(PYROPHOSPHATE))
(PYROPHOSPHATE-DEPENDENT
6-PHOSPHOFRUCTOSE-1-KINASE)
(PPI-PFK)


367 




Accession: 
P43125 


Swissprot_id: 
RDGB_DROME


Gi_number:
1172875 


Description: RETINAL 

DEGENERATION B 

PROTEIN 

(PROBABLE 

CALCIUM 

TRANSPORTER 

RDGB) 


261 




Accession: 
Q59320 


Swissprot_id: 
KDSB_CHLTR


Gi_number:
7387818 


Description: 3- 
DEOXY-MANNO- 
OCTULOSONATE 
CYTIDYLYLTRANS 
FERASE (CMP-KDO 

SYNTHETASE) 

(CMP-2-KETO-3- 

DEOXYOCTULOSO 

NIC ACID 

SYNTHETASE) 

(CKS) 


221 




Accession: 
P55217 


Swissprot_id: 
METB_ARATH 


Gi_number:
2507422 


Description: 
CYSTATHIONINE 
GAMMA- 
SYNTHASE, 
CHLOROPLAST 
PRECURSOR (CGS) 
(O- 

SUCCINYLHOMOS 

ERINE (THIOL)-

LYASE) 


57 




Accession: 
P09830 


Swissprot_id: 
ARAE_ECOLI 


Gi_number:
114102 


Description: 

ARABINOSE- 

PROTON 

SYMPORTER 

(ARABINOSE 

TRANSPORTER) 


25 




Accession: 
Q9SYQ8 


Swissprot_id: 
CLV1_ARATH 


Gi_number:
12643323 


Description: 
RECEPTOR 
PROTEIN KINASE 
CLAVATA1 
PRECURSOR 


369 




Accession:
P06921


Swissprot_id:
VE2_HPV05


Gi_number:
1352839


Description:
REGULATORY
PROTEIN E2


39 




Accession: 
Q9UQ13 


Swissprot_id: 
SHO2_HUMAN


Gi_number:
14423936


Description:
LEUCINE-RICH
REPEAT PROTEIN 
SHOC-2 (RAS- 
BINDING PROTEIN 
SUR-8) 


87 




Accession: 
P27935 


Swissprot_id: 
AM2A_ORYSA 


Gi_number:
113678 


Description: Alpha- 
amylase isozyme 2A
precursor (1,4-alpha-
D-glucan 
glucanohydrolase) 


371 




Accession: 
Q02921 


Swissprot_id:
NO93_SOYBN


Gi_number:
730165


Description: EARLY
NODULIN 93 (N-93)


163 




Accession: 
P52178 


Swissprot_id: 
CPT2_BRAOL 


Gi_number:
1706110 


Description: Triose 
phosphate/phosphate 
translocator, non-green

plastid, 
chloroplast precursor 
(CTPT) 


375 




Accession: 
P54069 


Swissprot_id:
BE46_SCHPO


Gi_number:
12644312 


Description: BEM46 
PROTEIN 


315 




Accession: 
P20025 


Swissprot_id: 
MYB3_MAIZE


Gi_number:
127582 


Description: Myb- 
related protein Zm38 


89 




Accession: 
P27934 


Swissprot_id: 
AM3E_ORYSA 


Gi_number:
113683 


Description: ALPHA- 
AMYLASE 
ISOZYME 3E 
PRECURSOR (1,4- 
ALPHA-D-GLUCAN 

GLUCANOHYDROL 
ASE) 


289 




Accession: 
P37833 


Swissprot_id:
AATC_ORYSA


Gi_number:
584706 


Description: 
ASPARTATE 
AMINOTRANSFER 
ASE, 

CYTOPLASMIC 

(TRANSAMINASE 

A) 


49 




Accession: 
Q41144 


Swissprot_id:
STC_RICCO


Gi_number:
3915039 


Description: SUGAR 
CARRIER PROTEIN 
C 






153 




Accession: 
P21727 


Swissprot_id: 
CPTR_PEA


Gi_number:
117290 


Description: TRIOSE 

PHOSPHATE/PHOS 

PHATE 

TRANSLOCATOR, 
CHLOROPLAST 
PRECURSOR 
(CTPT) (P36) (E30) 


81 




Accession: 
P17654


Swissprot_id:
AMY1_ORYSA


Gi_number:
113766 


Description: ALPHA- 
AMYLASE 
PRECURSOR (1,4- 
ALPHA-D-GLUCAN 

GLUCANOHYDROL 
ASE) (ISOZYME 1B)


379 




Accession: 
O43516


Swissprot_id:
WAIP_HUMAN


Gi_number:
13124642 


Description: 
WISKOTT- 
ALDRICH 

PROTEIN 
INTERACTING 
PROTEIN (WASP 

INTERACTING 
PROTEIN) (PRPL-2 
PROTEIN) 


305 




Accession: 
Q02516 


Swissprot_id: 
HAP5_YEAST 


Gi_number:
2493550 


Description: 
TRANSCRIPTIONA 
L ACTIVATOR 
HAP5


381 




Accession:
P10978


Swissprot_id:
POLX_TOBAC


Gi_number:
130582


Description:
Retrovirus-related Pol
polyprotein from 
transposon TNT 
1-94 [Contains: 
Protease ; Reverse 
transcriptase ; 
Endonuclease] 


197 




Accession: 
P01087 


Swissprot_id: 
IAAT_ELECO


Gi_number:
2851515 


Description: Alpha- 
amylase/trypsin 
inhibitor (RBI) (RATI) 


45 




Accession: 
O76082


Swissprot_id:
OCN2_HUMAN


Gi_number:
8928257 


Description: Organic 
cation/carnitine
transporter 2 (Solute 
















carrier family 
22, member 5) (High-
affinity sodium- 
dependent 

carnitine cotransporter) 


97 




Accession: 
Q01885 


Swissprot_id:
RAG2_ORYSA


Gi_number:
548671 


Description: SEED 
ALLERGENIC 
PROTEIN RAG2 
PRECURSOR 


383 




Accession: 
w iv/ 


Swissprot_id:

KIN r D_JVlkJU on 


Gi_number:


Description: RING 

12 (LIM DOMAIN 
INTERACTING 
RING FINGER 
PROTEIN) (RING 
FINGER LIM 
DOMAIN-BINDING 
PROTEIN) (R-LIM)


135 




Accession: 
P55241 


Swissprot_id:
GLG1_MAIZE


Gi_number:
1707924 


Description: Glucose- 
1-phosphate
adenylyltransferase
large subunit 1,
chloroplast precursor
(ADP-glucose
synthase) (ADP-
glucose 

pyrophosphorylase) 

D-glucose-1-
phosphate adenyl 
transferase) (Shrunken- 
2) 


267 




Accession: 


Swissprot_id:
PRP3_MOUSE


Gi_number:
131002


Description:
PROLINE-RICH
PROTEIN MP-3 


385 




Accession: 
Q10993


Swissprot_id:
CYTB_HELAN


Gi_number:
1706277 


Description: 
CYSTEINE 
PROTEINASE 
INHIBITOR B 
(CYSTATIN B) 
(SCB) 


283 




Accession: 
P49311 


Swissprot_id:
GRP2_SINAL


Gi_number:
1346181 


Description: Glycine- 
rich RNA-binding
















protein GRP2A 


53 




Accession: 
P39163 


Swissprot_id:
CHAC_ECOLI


Gi_number:
12644253 


Description: CATION 
TRANSPORT 
PROTEIN CHAC 


253 




Accession: 
Q9KQX0 


Swissprot_id:
LPXK_VIBCH


Gi_number:
14423750 


Description: 
Tetraacyldisaccharide 
4'-kinase (Lipid A 4'-
kinase) 


295




Accession: 
Q9I2W7 


Swissprot_id:
MENG_PSEAE


Gi_number:
17369015 


Description: S- 

adenosylmethionine:2- 

demethylmenaquinone 

methyltransferase 


389 




Accession: 
P13983


Swissprot_id:
EXTN_TOBAC


Gi_number:
119714 


Description: Extensin 
precursor (Cell wall 
hydroxyproline-rich 

glycoprotein) 


225 




Accession: 
P14323 


Swissprot_id:
GLU4_ORYSA


Gi_number:
121476 


Description: 
GLUTELIN 
PRECURSOR 


391 




Accession: 
P08453 


Swissprot_id:
GDB2_WHEAT


Gi_number:
121101


Description:

GAMMA-GLIADIN 

PRECURSOR 


167 




Accession: 
P32604 


Swissprot_id:
F26_YEAST


Gi_number:
1169587 


Description: Fructose- 
2,6-bisphosphatase 


137 




Accession: 
P55238 


Swissprot_id:
GLGS_HORVU


Gi_number:
1707940 


Description: Glucose- 
1-phosphate
adenylyltransferase 
small subunit, 
chloroplast precursor 
(ADP-glucose 
synthase) (ADP- 
glucose 

pyrophosphorylase) 
(AGPASE B) (Alpha- 
D-glucose-1-
phosphate adenyl 
transferase) 


195 




Accession: 
Q02817 


Swissprot_id:
MUC2_HUMAN


Gi_number:
2506877 


Description: MUCIN 2 
PRECURSOR 
(INTESTINAL 
MUCIN 2) 


263 




Accession:
O22643


Swissprot_id:
ACBP_FRIAG


Gi_number:
5902717


Description: ACYL-
COA-BINDING
PROTEIN (ACBP)


223 




Accession: 
P07728 


Swissprot_id: 
GLU1_ORYSA


Gi_number:
121469 


Description: 
GLUTELIN TYPE I 
PRECURSOR 
(CLONE PREE 61) 


85 




Accession: 
P27937 


Swissprot_id: 
AM3B_ORYSA 


Gi_number:
113680 


Description: ALPHA- 
AMYLASE 
ISOZYME 3B 
PRECURSOR (1,4- 
ALPHA-D-GLUCAN 

GLUCANOHYDROL 
ASE) 


129 




Accession: 
Q43093 


Swissprot_id: 
UGS3_PEA 


Gi_number:
2833384 


Description: Glycogen 
[starch] synthase, 
chloroplast precursor 
(GBSSII) 

(Granule-bound starch 
synthase II) 


103 




Accession: 
P93594 


Swissprot_id: 
AMYB_WHEAT


Gi_number:
3334120 


Description: BETA- 
AMYLASE (1,4- 
ALPHA-D-GLUCAN 
MALTOHYDROLAS 
E) 


51 




Accession: 
P46032 


Swissprot_id: 
PT2B_ARATH 


Gi_number:
1172704 


Description: Peptide 
transporter PTR2-B 
(Histidine transporting 
protein) 


99 




Accession: 
Q01885 


Swissprot_id: 
RAG2_ORYSA 


Gi_number:
548671 


Description: SEED 
ALLERGENIC 
PROTEIN RAG2 
PRECURSOR 


69 




Accession: 
Q01401 


Swissprot_id: 
GLGB_ORYSA 


Gi_number:
399544 


Description: 1,4- 
ALPHA-GLUCAN 
BRANCHING 
ENZYME (STARCH 
BRANCHING 
ENZYME) (Q-
ENZYME)


229 




Accession: 
P07730 


Swissprot_id: 
GLU2_ORYSA 


Gi_number:
121475 


Description: 
GLUTELIN TYPE II 
PRECURSOR 






241 




Accession: 
P15839 


Swissprot_id: 
PRO1_ORYSA


Gi_number:
130946 


Description: 10KD 

PROLAMIN 

PRECURSOR 


91 




Accession: 
P17654


Swissprot_id:
AMY1_ORYSA


Gi_number:
113766 


Description: ALPHA- 
AMYLASE 
PRECURSOR (1,4- 
ALPHA-D-GLUCAN 

GLUCANOHYDROL 
ASE) (ISOZYME 1B)


401 




Accession: 
P14323


Swissprot_id:
GLU4_ORYSA


Gi_number:
121476 


Description: 
GLUTELIN 
PRECURSOR 


121 




Accession: 
P31924 


Swissprot_id:
SUS2_ORYSA


Gi_number:
401140 


Description: Sucrose 
synthase 2 (Sucrose- 
UDP 

glucosyltransferase 2) 


403 




Accession: 
O65806


Swissprot_id:
MGN_EUPLA


Gi_number:
6016561 


Description: Mago 
nashi protein homolog 


187 




Accession: 


Swissprot_id:
F16P_ORYSA


Gi_number:
3913641


Description:
FRUCTOSE-1,6-
BISPHOSPHATASE, 
CHLOROPLAST 
PRECURSOR 
(D-FRUCTOSE-1,6- 
BISPHOSPHATE 1- 
PHOSPHOHYDROL 
ASE) (FBPASE) 


13 




Accession: 
Q41001 


Swissprot_id:
BCP_PEA


Gi_number:
2493318 


Description: Blue 
copper protein 
precursor 


243 




Accession: 
P20698 


Swissprot_id:
PRO7_ORYSA


Gi_number:
130959 


Description: 
PROLAMIN PPROL 
17 PRECURSOR 


203 




Accession: 
Q10767 


Swissprot_id:
GLGX_MYCTU


Gi_number:
1707945 


Description: Glycogen 
operon protein glgX 
homolog 


407 




Accession: 
Q00808 


Swissprot_id:
HET1_PODAN


Gi_number:
3023956 


Description: 
Vegetative
incompatibility protein 
HET-E-1 


409 




Accession: 
P47917 


Swissprot_id:
ZRP4_MAIZE


Gi_number:
1353193 


Description: O- 
METHYLTRANSFER 
















ASE ZRP4 (OMT) 


411 





Accession: 
P08640 


Swissprot_id:
AMYH_YEAST


Gi_number:
728850 


Description: 
GLUCOAMYLASE 
S1/S2 PRECURSOR 
(GLUCAN 1,4- 
ALPHA- 

GLUCOSIDASE)

(1,4-ALPHA-D- 

GLUCAN 

GLUCOHYDROLAS 
E) 


105 




Accession: 
P55005 


Swissprot_id:
AMYB_MAIZE


Gi_number:
1703302 


Description: BETA- 
AMYLASE (1,4- 
ALPHA-D-GLUCAN 
MALTOHYDROLAS 
E) 


107 




Accession: 


Swissprot_id:
AMYB_SOYBN


Gi_number:


Description: BETA-
AMYLASE (1,4-
ALPHA-D-GLUCAN 
MALTOHYDROLAS 
E) 


115 




Accession: 
Q43763 


Swissprot_id: 
AGLU_HORVU


Gi_number:
3023275 


Description: ALPHA- 
GLUCOSIDASE 
PRECURSOR 
(MALTASE) 


1 J 




Accession:
P25685


Swissprot_id:
DJB1_HUMAN


Gi_number:
1706473


Description: DnaJ

homolog subfamily B 
member 1 (Heat shock 
40 kDa protein 
1) (Heat shock protein 
40) (HSP40) (DnaJ 
protein 

homolog 1) (HDJ-1)


165 




Accession: 
P27598 


Swissprot_id:
PHSL_IPOBA


Gi_number:
130172


Description: Alpha-1,4
glucan phosphorylase, 
L isozyme, chloroplast 

precursor 
(Starch phosphorylase 

L) 


123 




Accession: 
Q43009 


Swissprot_id:
SUS3_ORYSA


Gi_number:
3915054 


Description: Sucrose 
synthase 3 (Sucrose- 
UDP 

glucosyltransferase 3) 






205 




Accession: 


Swissprot_id:
MUC2_HUMAN


Gi_number:


Description: MUCIN 2 

(INTESTINAL 
MUCIN 2) 


413 




Accession: 

.T*fUOl/.> 


Swissprot_id:
APG_BRANA


Gi_number:


Description: ANTER-
SPECIFIC

PROLINE-RICH 
PROTEIN APG 
(PROTEIN CEX) 


209




Accession: 
P13526


Swissprot_id:
ARLC_MAIZE


Gi_number:
114156


Description:
ANTHOCYANIN 
REGULATORY LC 
PROTEIN 


323




Accession: 
P70315 


Swissprot_id:
WASP_MOUSE


Gi_number:
2499130


Description: Wiskott-
Aldrich syndrome 
protein homolog 
(WASP) 


415 




Accession: 
P19837


Swissprot_id:
SPD1_NEPCL


Gi_number:
1174414 


Description: 
SPIDROIN 1 
(DRAGLINE SILK 
FIBROIN 1) 


141 




Accession: 
P55238 


Swissprot_id:
GLGS_HORVU


Gi_number:
1707940 


Description: Glucose- 
1-phosphate
adenylyltransferase
small subunit,
chloroplast precursor
(ADP-glucose
synthase) (ADP-
glucose

pyrophosphorylase)

(AGPASE B) (Alpha-
D-glucose-1-
phosphate adenyl 
transferase) 


77 




Accession:
Q02723


Swissprot_id:
RKI1_SECCE


Gi_number:
400982


Description: Carbon
catabolite derepressing 
protein kinase 


65 




Accession: 
P15710 


Swissprot_id:
PHO4_NEUCR


Gi_number:
130117 


Description: 

PHOSPHATE- 

REPRESSIBLE

PHOSPHATE 

PERMEASE 


185 




Accession:
Q41140


Swissprot_id:
PFPA_RICCO


Gi_number:
2499488


Description:
PYROPHOSPHATE--FRUCTOSE 6-
PHOSPHATE 1-
PHOSPHOTRANSFERASE
ALPHA SUBUNIT
(PFP) (6-
PHOSPHOFRUCTOKINASE
(PYROPHOSPHATE))
(PYROPHOSPHATE-DEPENDENT
6-PHOSPHOFRUCTOSE-1-KINASE)
(PPI-PFK)


299 




Accession: 
P09651 


Swissprot_id:
ROA1_HUMAN


Gi_number:
133254 


Description: 
Heterogeneous nuclear 
ribonucleoprotein A1

(Helix-
destabilizing protein)
(Single-strand binding

protein)
(hnRNP core protein
A1)


67 




Accession: 
P46032 


Swissprot_id:
PT2B_ARATH


Gi_number:
1172704 


Description: Peptide 
transporter PTR2-B 
(Histidine transporting 
protein) 


17 




Accession: 
Q02028 


Swissprot_id: 
HS7S_PEA


Gi_number:
399942 


Description: Stromal 
70 kDa heat shock- 
related protein, 
chloroplast 
precursor 


279 




Accession: 
P38994 


Swissprot_id: 
MSS4_YEAST 


Gi_number:
1709144 


Description: Probable 
phosphatidylinositol-4-
phosphate 5-kinase 
MSS4 (1- 
phosphatidylinositol-4- 
phosphate kinase) 
(PIP5K) 

(PtdIns(4)P-5-kinase) 
















(Diphosphoinositide 
kinase) 


71 




Accession: 
Q08047 


Swissprot_id:
GLGB_MAIZE


Gi_number:
1169911 


Description: 1,4-alpha-
glucan branching
enzyme IIB,
chloroplast
precursor (Starch
branching enzyme IIB)
(Q-enzyme) 


207 




Accession: 
P49572 


Swissprot_id:
TRPC_ARATH


Gi_number:
1351303 


Description: Indole-3- 
glycerol phosphate 
synthase, chloroplast 
precursor 

(IGPS)


417 




Accession: 
P28284 


Swissprot_id:
ICP0_HSV2H


Gi_number:
124135 


Description: Trans- 
acting transcriptional 
protein ICP0
(VMW118 protein)


127 




Accession: 
O24301


Swissprot_id:
SUS2_PEA


Gi_number:
3915037 


Description: Sucrose 
synthase 2 (Sucrose- 
UDP 

glucosyltransferase 2) 


125 




Accession: 
O24301


Swissprot_id:
SUS2_PEA


Gi_number:
3915037 


Description: Sucrose 
synthase 2 (Sucrose- 
UDP 

glucosyltransferase 2) 


183 




Accession: 
Q59126 


Swissprot_id:
PFP_AMYME


Gi_number:
3122594 


Description: 
Pyrophosphate--
fructose 6-phosphate
1-phosphotransferase
(6-

phosphofructokinase
(Pyrophosphate))
(Pyrophosphate-
dependent 6-
phosphofructose-1-
kinase) (PPI- 
PFK) 


419 




Accession: 
Q02897 


Swissprot_id:
GLUC_ORYSA


Gi_number:
544400 


Description: 
GLUTELIN TYPE-B 
2 PRECURSOR 


421 




Accession: 
Q06003 


Swissprot_id:
GOLI_DROME


Gi_number:
462193


Description: Goliath
protein (G1 protein)






29 




Accession: 
P53682 


Swissprot_id:
CDP1_ORYSA


Gi_number:
1705733 


Description: Calcium- 
dependent protein 
kinase, isoform 1 
(CDPK 1) 


297 




Accession: 
P25822 


Swissprot_id:
PUM_DROME


Gi_number:
131605 


Description: 
MATERNAL 
PUMILIO PROTEIN 


245 




Accession: 
P45386 


Swissprot_id:
IGA4_HAEIN


Gi_number:
1170517


Description:
IMMUNOGLOBULIN
A1 PROTEASE
PRECURSOR (IGA1 
PROTEASE) 


427 




Accession: 
Q05654 


Swissprot_id:
RDPO_SCHPO


Gi_number:
1710054 


Description: 

RETROTRANSPOSA 
BLE ELEMENT TF2 
155 KDA PROTEIN 


159/171 




Accession: 
P42862 


Swissprot_id:
G6PA_ORYSA


Gi_number:
1169797 


Description: Glucose- 
6-phosphate 
isomerase, cytosolic A 
(GPI-A) 
(Phosphoglucose 
isomerase A) (PGI-A) 
(Phosphohexose 
isomerase A) (PHI-A)


31 




Accession: 
P46032 


Swissprot_id:
PT2B_ARATH


Gi_number:
1172704 


Description: Peptide 
transporter PTR2-B 
(Histidine transporting 
protein) 


403/431 




Accession: 
P02845 


Swissprot_id:
VIT2_CHICK


Gi_number:
138595 


Description: 
VITELLOGENIN II 
PRECURSOR 
(MAJOR 

VITELLOGENIN) 
[CONTAINS: 
LIPOVITELLIN I 
(LVI); PHOSVITIN 
(PV); LIPOVITELLIN 
II (LVII);
YGP40] 


275 




Accession: 
P15276


Swissprot_id:
ALGP_PSEAE


Gi_number:
13959675 


Description: 
TRANSCRIPTIONA 
L REGULATORY 
PROTEIN ALGP 
















(ALGINATE 
REGULATORY 
PROTEIN ALGR3) 


19 




Accession: 
O62830


Swissprot_id:
P2CB_BOVIN


Gi_number:
10720178 


Description: Protein 
phosphatase 2C beta 
isoform (PP2C-beta) 


151 




Accession: 
Q9LKJ3 


Swissprot_id:
PHSH_WHEAT


Gi_number:
14916632


Description: Alpha-
glucan phosphorylase, 
H isozyme (Starch 
phosphorylase H) 


213/227 




Accession: 
P29835 


Swissprot_id: 
GL19_ORYSA 


Gi_number:
232161 


Description: 19 KD 
GLOBULIN 
PRECURSOR 
(ALPHA- 
GLOBULIN) 


237 




Accession: 
P02595 


Swissprot_id:
CALM_PATSP


Gi_number:
115518 


Description: 
CALMODULIN 


133 




Accession: 
Q42968 


Swissprot_id: 
UGST_ORYGL 


Gi_number:
2833382 


Description: Granule- 
bound glycogen 
[starch] synthase, 
chloroplast 
precursor 


239 




Accession: 
Q09151 


Swissprot_id: 
GLU3_ORYSA 


Gi_number:
1707986


Description:
GLUTELIN TYPE-A
III PRECURSOR 


161 




Accession: 
Q43772 


Swissprot_id:
UDPG_HORVU


Gi_number:
6136111


Description: UTP--
GLUCOSE-1-
PHOSPHATE
URIDYLYLTRANSF
ERASE (UDP- 
GLUCOSE 
PYROPHOSPHORY 
LASE) (UDPGP) 
(UGPASE) 


61 




Accession: 
P70545 


Swissprot_id: 
NDC2_RAT 


Gi_number:
2499525 


Description: Intestinal 

sodium/dicarboxylate 

cotransporter 

(Na(+)/dicarboxylate 

cotransporter) 


47 




Accession: 
P25297 


Swissprot_id: 
PH84_YEAST 


Gi_number:
1346710 


Description: 
INORGANIC 
PHOSPHATE 
TRANSPORTER 
















PHO84


219 




Accession: 
P20698 


Swissprot_id:
PRO7_ORYSA


Gi_number:
130959 


Description: 
PROLAMIN PPROL 
17 PRECURSOR 


435 




Accession: 
Q01881 


Swissprot_id:
RA05_ORYSA


Gi_number:
548657 


Description: SEED 
ALLERGENIC 
PROTEIN RA5 
PRECURSOR 


259/271 




Accession: 
Q42980 


Swissprot_id:
OLE1_ORYSA


Gi_number:
3334280 


Description: 
OLEOSIN 16 KD 
(OSE701) 


93 




Accession: 
P46573 


Swissprot_id:
APKB_ARATH


Gi_number:
12644274 


Description: PROTEIN 
KINASE APK1B 


441 




Accession: 
Q03685 


Swissprot_id:
BIP5_TOBAC


Gi_number:
729623 


Description: Luminal 
binding protein 5 
precursor (BiP 5) (78 
kDa glucose- 
regulated protein 
homolog 5) (GRP 78- 
5) 


111 




Accession: 
Q99758 


Swissprot_id:
ABC3_HUMAN


Gi_number:
7387524 


Description: ATP- 
binding cassette, sub- 
family A, member 3 
(ATP-binding 
cassette transporter 3) 
(ATP-binding cassette 
3) (ABC-C 
transporter) 


73 




Accession: 
Q08047 


Swissprot_id:
GLGB_MAIZE


Gi_number:
1169911 


Description: 1,4-alpha- 
glucan branching 
enzyme IIB,
chloroplast
precursor (Starch
branching enzyme IIB)
(Q-enzyme)


443 




Accession: 
Q03685 


Swissprot_id:
BIP5_TOBAC


Gi_number:
729623 


Description: Luminal 
binding protein 5 
precursor (BiP 5) (78 
kDa glucose- 
regulated protein 
homolog 5) (GRP 78- 
5) 


235 




Accession:
P14614


Swissprot_id:
GLU5_ORYSA


Gi_number:
121477


Description:
GLUTELIN
PRECURSOR


217 




Accession: 
P17048


Swissprot_id:
PRO2_ORYSA


Gi_number:
6174927 


Description: 13 KD 

PROLAMIN 

PRECURSOR 


257 




Accession: 
Q40646 


Swissprot_id:
OLE2_ORYSA


Gi_number:
3334279 


Description: 
OLEOSIN 18 KD 
(OSE721) 


201 




Accession: 
P47735 


Swissprot_id:
RLK5_ARATH


Gi_number:
1350783 


Description: Receptor- 
like protein kinase 5 
precursor 


445 




Accession: 
P21997 


Swissprot_id: 
SSGP_VOLCA 


Gi_number:
134920 


Description: 

SULFATED 

SURFACE 

GLYCOPROTEIN 

185 (SSG185)


281 




Accession: 
P38999 


Swissprot_id: 
LYS9_YEAST


Gi_number:
729968 


Description: 

SACCHAROPINE 

DEHYDROGENASE 

[NADP+, L- 

GLUTAMATE 

FORMING] 


251 




Accession: 
Q00195 


Swissprot_id: 
CNG2_RAT 


Gi_number:
116574 


Description: Cyclic- 
nucleotide-gated 
olfactory channel 
(Cyclic-nucleotide- 
gated cation channel 2) 
(CNG channel 2) 
(CNG2) (CNG-2) 
(OCNC1) 


3 




Accession: 
P47735 


Swissprot_id:
RLK5_ARATH


Gi_number:
1350783 


Description: Receptor- 
like protein kinase 5 
precursor 


447 




Accession: 
O60683


Swissprot_id:
PEXA_HUMAN


Gi_number:
3914299 


Description: 
Peroxisome assembly 
protein 10 (Peroxin-
10) 


21 




Accession: 
P46573 


Swissprot_id:
APKB_ARATH


Gi_number:
12644274 


Description: PROTEIN 
KINASE APK1B 


179 




Accession: 
P21343 


Swissprot_id:
PFPB_SOLTU


Gi_number:
2507174 


Description: 

Pyrophosphate--
fructose 6-phosphate
1-phosphotransferase
beta subunit
(PFP) (6-

phosphofructokinase
(Pyrophosphate))
(Pyrophosphate-
dependent 6-
phosphofructose-1-
kinase) (PPI- 
PFK) 


319 




Accession: 
Q64467 


Swissprot_id: 
G3PT_MOUSE 


Gi_number:
2494630 


Description: 
GLYCERALDEHYD 
E 3-PHOSPHATE 
DEHYDROGENASE, 
TESTIS-SPECIFIC
(GAPDH) 


7 




Accession: 
P20346 


Swissprot_id:
P322_SOLTU


Gi_number:
129350 


Description: Probable 
protease inhibitor P322 
precursor 


291 




Accession: 
O08816


Swissprot_id:
WASL_RAT


Gi_number:
13431956 


Description: Neural 
Wiskott-Aldrich 
syndrome protein (N- 
WASP) 


169 




Accession: 
O64459


Swissprot_id:
UDPG_PYRPY


Gi_number:
6136112


Description: UTP--
glucose-1-phosphate
uridylyltransferase
(UDP-glucose
pyrophosphorylase) 
(UDPGP) (UGPase) 


83 




Accession: 
P27933 


Swissprot_id:
AM3D_ORYSA


Gi_number:
113682 


Description: ALPHA- 
AMYLASE 
ISOZYME 3D 
PRECURSOR (1,4- 
ALPHA-D-GLUCAN 

GLUCANOHYDROL 
ASE) 


269 




Accession: 
O14939


Swissprot_id:
PLD2_HUMAN


Gi_number:
13124441 


Description: 
PHOSPHOLIPASE
D2 (PLD2)
(CHOLINE 
PHOSPHATASE 2) 

(PHOSPHATIDYLC 
HOLINE-
HYDROLYZING 
PHOSPHOLIPASE 
D2) (PLD1C) 


95 




Accession: 
Q01885 


Swissprot_id:
RAG2_ORYSA


Gi_number:
548671 


Description: SEED 
ALLERGENIC 
PROTEIN RAG2 
PRECURSOR 


9 




Accession: 
Q03387 


Swissprot_id:
IF41_WHEAT


Gi_number:
1170504


Description: Eukaryotic
initiation factor (iso)4F 
subunit P82-34 
(eIF-(iso)4F P82-34) 


440 




Accession:
P50897


Swissprot_id:
PPT_HUMAN


Gi_number:
1709747 


Description: Palmitoyl- 
protein thioesterase 
precursor 
(Palmitoyl-protein 
hydrolase) 


451 




Accession: 
P47179 


Swissprot_id:
DAN4_YEAST


Gi_number:
1352944 


Description: Cell wall 
protein DAN4 
precursor 


277 




Accession: 
Q02817 


Swissprot_id:
MUC2_HUMAN


Gi_number:
2506877 


Description: MUCIN 2 
PRECURSOR 
(INTESTINAL 
MUCIN 2) 


285 




Accession: 
P25822 


Swissprot_id:
PUM_DROME


Gi_number:
131605 


Description: 
MATERNAL 
PUMILIO PROTEIN 


453 




Accession:
P06921


Swissprot_id:
VE2_HPV05


Gi_number:
1352839 


Description: 
REGULATORY 
PROTEIN E2 


265 




Accession: 
P40602 


Swissprot_id:
APG_ARATH


Gi_number:
728867 


Description: ANTER- 
SPECIFIC 
PROLINE-RICH 
PROTEIN APG 
PRECURSOR 


327 




Accession: 
Q08759 


Swissprot_id:
MYB_XENLA


Gi_number:
730090 


Description: Myb 
protein 


231 




Accession: 
P27164 


Swissprot_id:
CAL3_PETHY


Gi_number:
115492 


Description: 
CALMODULIN- 
RELATED PROTEIN 


37 




Accession: 
P46032 


Swissprot_id:
PT2B_ARATH


Gi_number:
1172704 


Description: Peptide 
transporter PTR2-B 
(Histidine transporting 
protein)


455 




Accession: 
P02845 


Swissprot_id: 
VIT2_CHICK 


Gi_number:
138595 


Description: 
VITELLOGENIN II 
PRECURSOR 
(MAJOR 

VITELLOGENIN) 
[CONTAINS: 
LIPOVITELLIN I
(LVI); PHOSVITIN
(PV); LIPOVITELLIN
II (LVII);
YGP40] 


43 




Accession: 
P93766 


Swissprot_id: 
MLO.HORVU 


Gi_numben 
6016588 


Description: MLO 
PROTEIN 


457 




Accession: 
Q07878 


Swissprot_id: 
VP13_YEAST 


Gi_numben 
2499125 


Description: 
VACUOLAR 
PROTEIN 
SORTING- 
ASSOCIATED 
PROTEIN VPS 13 


459 




Accession: 
Q50634 


Swissprot_id: 
SECD_MYCTU 


GLnumber 
2498898 


Description: Protein- 
export membrane 
protein secD 


293 




Accession: 
P29141 


Swissprot_id: 
SUBV_BACSU 


GLnumber. 
135023 


Description: Minor 
extracellular protease 
VPR precursor 


321 




Accession: 
P0U03 


Swissprot_id: 
MYB_CH1CK 


Gi_number 
127591 


Description: Myb 
proto-oncogene 
protein (C-myb) 


79 




Accession: 
P08640 


Swissprot_id: 
AMYH.YEAST 


GLnumber 
728850 


Description: 
GLUCOAMYLASE 
SI/S2 PRECURSOR 
(GLUCAN 1,4- 
ALPHA- 
GLUCOSIDASE) 
(1,4-ALPHA-D- 
GLUCAN 

GLUCOHYDROLAS 
E) 


211 




Accession: 
P08079 


Swissprot_id: 
GDB0_WHEAT 


GLnumber 
121099 


Description: 

GAMMA-GLIADIN 

PRECURSOR 


177 




Accession: 


Swissprot_id: 


GLnumber 


Description: 
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P46256 


ALF1_PEA 


1168408 


FRUCTOSE- 
BISPHOSPHATE 
ALDOLASE, 
CYTOPLASMIC 
ISOZYME 1 


461 




Accession: 
Q02817 


Swissprot_id: 
MUC2.HUMAN 


Gi_numben 
2506877 


Description: MUCIN 2 
PRECURSOR 
(INTESTINAL 
MUCIN 2) 



All publications, patents and patent applications are incorporated herein by reference. While
in the foregoing specification this invention has been described in relation to certain preferred
embodiments thereof, and many details have been set forth for purposes of illustration, it will be
apparent to those skilled in the art that the invention is susceptible to additional embodiments and that
certain of the details described herein may be varied considerably without departing from the basic
principles of the invention.

What is claimed is:

1. A polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is
involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant
grain and the expression of which is up-regulated during grain filling, which nucleotide sequence is
substantially similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 70 - 210 or a
partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at
least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of
the full-length polypeptide.

2. The polynucleotide of claim 1 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 69 - 209 or a part thereof which still 
encodes a partial-length polypeptide having substantially the same activity as the 
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in SEQ ID NO: 69 - 
209, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

3. A polynucleotide according to claim 1 comprising a nucleotide sequence encoding a
polypeptide which is involved in or associated with starch biosynthesis and up-regulated during grain
filling, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as
given in SEQ ID NOs: 70 - 188 or a partial-length polypeptide having substantially the same activity
as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably
at least 90% to 95% the activity of the full-length polypeptide.

4. The polynucleotide of claim 3 comprising a nucleotide sequence

a) as given in any one of the SEQ ID NOs of Table 7 such as SEQ ID NOs: 69 -
187 or a part thereof which still encodes a partial-length polypeptide having
substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the
activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 69 - 
187, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c).

5. The polynucleotide of claim 3 comprising a nucleotide sequence encoding a polypeptide with 
an activity of a small and large subunit ADPG pyrophosphorylase, respectively, which nucleotide 
sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ 
ID NOs: 136 - 142 or a partial-length polypeptide having substantially the same activity as the full-
length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least
90% to 95% the activity of the full-length polypeptide. 

6. The polynucleotide of claim 5 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 135 - 141 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 



d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of nucleotides given in SEQ ID NO: 135 - 141, or the
complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c).



7. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a
polypeptide involved in starch structure rearrangement, which nucleic acid molecule is substantially
similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 76 - 78 exhibiting
isoamylase debranching enzyme activity; 70 - 74 exhibiting a branching enzyme activity, 80 - 92
exhibiting an α-amylase activity; 94 - 100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a
pullulanase activity; 102 - 108 exhibiting a β-amylase activity; 112 - 118 exhibiting an α-glucosidase
activity, or a partial-length polypeptide having substantially the same activity as the full-length
polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to
95% the activity of the full-length polypeptide.

8. The polynucleotide of claim 7, comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 75 - 77 exhibiting isoamylase
debranching enzyme activity, 69 - 73 exhibiting a branching enzyme activity, 79
- 91 exhibiting an α-amylase activity; 93 - 99 exhibiting an α-amylase inhibitor
activity; 109 exhibiting a pullulanase activity; 101 - 107, exhibiting a β-amylase
activity; 111 - 117 or a part thereof which still encodes a partial-length
polypeptide having substantially the same activity as the full-length polypeptide,
e.g., at least 50%, more preferably at least 80%, even more preferably at least
90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a);

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 75 -
77 exhibiting isoamylase debranching enzyme activity, 69 - 73 exhibiting a
branching enzyme activity, 79 - 91 exhibiting an α-amylase activity; 93 - 99
exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity;
101 - 107, exhibiting a β-amylase activity; 111 - 117;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

9. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a
polypeptide exhibiting an amylase or an amylase inhibitor activity, which nucleic acid molecule is
substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 80 - 92
exhibiting an α-amylase activity; and 94 - 100 exhibiting an α-amylase inhibitor activity, or a partial-
length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least
50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the
full-length polypeptide.

10. The polynucleotide of claim 9 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 79 - 91 exhibiting an α-amylase activity;
and 93 - 99 exhibiting an α-amylase inhibitor activity or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 79 - 91
exhibiting an α-amylase activity; and 93 - 99 exhibiting an α-amylase inhibitor
activity, or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c).

11. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a
polypeptide exhibiting a sucrose synthase activity, which nucleic acid molecule is substantially similar
to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 120 - 128 or a partial-length
polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-
length polypeptide.

12. The polynucleotide of claim 11 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 119 - 127 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a);

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more
consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 119 -
127 or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

13. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a 
polypeptide exhibiting a glucanase activity, which nucleic acid molecule is substantially similar to a 
nucleic acid encoding a polypeptide as given in SEQ ID NO: 192 or a partial-length polypeptide
having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more
preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length
polypeptide.

14. The polynucleotide of claim 13 comprising a nucleotide sequence 



a) as given in SEQ ID NO: 191 or a part thereof which still encodes a partial-
length polypeptide having substantially the same activity as the full-length
polypeptide, e.g., at least 50%, more preferably at least 80%, even more 
preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of nucleotides given in SEQ ID NO: 191 or the 
complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

15. A polynucleotide comprising a nucleotide sequence encoding a seed storage protein, which
nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ
ID NOs: 212 - 250 or a partial-length polypeptide having substantially the same activity as the full-
length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least
90% to 95% the activity of the full-length polypeptide.

16. The polynucleotide of claim 15 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 211 - 249 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even
more preferably at least 90% to 95% the activity of the full-length polypeptide;

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 211 - 249 or the complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c).

17. The polynucleotide of claim 15 comprising a nucleotide sequence encoding a glutelin protein
the expression of which is up-regulated during grain filling, which nucleic acid molecule is substantially
similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 224, 236, and 240 or a
partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at
least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of
the full-length polypeptide.

18. The polynucleotide of claim 17 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 223, 235, and 239 or a part thereof
which still encodes a partial-length polypeptide having substantially the same 
activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 
80%, even more preferably at least 90% to 95% the activity of the full-length 
polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 223, 235, and 239, or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

19. A polynucleotide according to claim 15 comprising a nucleotide sequence encoding a prolamin
protein the expression of which is up-regulated during grain filling, which nucleotide sequence is
substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs:
218, 220, 226 and 242 or a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at
least 90% to 95% the activity of the full-length polypeptide.

20. The polynucleotide of claim 19 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 217, 219, 225 and 241 or a part thereof
which still encodes a partial-length polypeptide having substantially the same
activity as the full-length polypeptide, e.g., at least 50%, more preferably at least
80%, even more preferably at least 90% to 95% the activity of the full-length
polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 217, 219, 225 and 241, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

21. A polynucleotide according to claim 15 comprising a nucleotide sequence encoding a gliadin
protein, the expression of which is up-regulated during grain filling, which nucleotide sequence is
substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs:
212, 219; 234, 248; and 250 or a partial-length polypeptide having substantially the same activity as
the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at
least 90% to 95% the activity of the full-length polypeptide.

22. The polynucleotide of claim 21 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 211, 220; 233, 247; and 249 or a part
thereof which still encodes a partial-length polypeptide having substantially the
same activity as the full-length polypeptide, e.g., at least 50%, more preferably at
least 80%, even more preferably at least 90% to 95% the activity of the full-
length polypeptide;

b) having substantial similarity to (a);

c) capable of hybridizing to (a) or the complement thereof;

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 135325; 135133; 10825, 135101; and 135103, or the complement 
thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 



23. A polynucleotide the expression of which is up-regulated during grain filling comprising a 
nucleotide sequence encoding a polypeptide that is involved in or associated with fatty acid synthesis 
or lipid metabolism, which nucleotide sequence is substantially similar to a nucleic acid sequence 
encoding a polypeptide as given in SEQ ID NOs: 252 - 280 or a partial-length polypeptide having 
substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at 
least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide. 



24. The polynucleotide of claim 23 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 251 - 279 or a part thereof which still
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of nucleotides given in any one of SEQ ID NOs: 251 - 
279 or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c).

25. A polynucleotide according to claim 23 comprising a nucleotide sequence encoding an oleosin
protein, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a 
polypeptide as given in SEQ ID NOs: 258 and 260 or a partial-length polypeptide having 
substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at 
least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide. 

26. The polynucleotide of claim 25 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 257 and 259 or a part thereof which still 
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 257 and 259, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c).

27. A polynucleotide according to claim 23 comprising a nucleotide sequence encoding a 
polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene 
and the expression of which is up-regulated during grain filling, which nucleotide sequence is 
substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NO: 278 
or a partial-length polypeptide having substantially the same activity as the full-length polypeptide,
e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the 
activity of the full-length polypeptide. 

28. The polynucleotide of claim 27 comprising a nucleotide sequence

a) as given in any one of SEQ ID NOs: 277 or a part thereof which still encodes a
partial-length polypeptide having substantially the same activity as the full-length
polypeptide, e.g., at least 50%, more preferably at least 80%, even more 
preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 277, or the complement thereof; 

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 

29. A polynucleotide comprising a nucleotide sequence that encodes a polypeptide that acts as a
transcription factor and the expression of which is up-regulated during grain filling, which nucleotide
sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ
ID NOs: 302-328 or a partial-length polypeptide having substantially the same activity as the full-
length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least
90% to 95% the activity of the full-length polypeptide.

30. The polynucleotide of claim 29 comprising a nucleotide sequence 

a) as given in any one of SEQ ID NOs: 301-327 or a part thereof which still 
encodes a partial-length polypeptide having substantially the same activity as the 
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 301-327, or the complement thereof;

e) complementary to (a), (b) or (c); and

f) which is the reverse complement of (a), (b) or (c).



31. A polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of
which is involved in or associated with the metabolism of amino acids and the expression of which is
up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid
sequence encoding a polypeptide as given in SEQ ID NOs: 282 - 300 or a partial-length
polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%,
more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-
length polypeptide.

32. The polynucleotide of claim 31 comprising a nucleotide sequence 



a) as given in any one of SEQ ID NOs: 281 - 299 or a part thereof which still 
encodes a partial-length polypeptide having substantially the same activity as the
full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even 
more preferably at least 90% to 95% the activity of the full-length polypeptide; 

b) having substantial similarity to (a); 

c) capable of hybridizing to (a) or the complement thereof; 

d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more 
consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID 
NOs: 281 - 299, or the complement thereof;

e) complementary to (a), (b) or (c); and 

f) which is the reverse complement of (a), (b) or (c). 



33. A polypeptide which has an amino acid sequence encoded by any one of the polynucleotides
according to claims 1 to 32.

34. A polypeptide according to claim 33, which has an amino acid sequence encoded by a
polynucleotide selected from the group consisting of SEQ ID NOs: 1 to 461, 501-511, and 513-641.

35. A polypeptide according to claim 33 wherein said polypeptide has at least 90% amino acid
sequence identity to a polypeptide selected from the group consisting of SEQ ID NOs: 2 - 462,
502-512, and 514-642.

36. An isolated nucleic acid molecule comprising a nucleotide sequence, which nucleotide sequence
is obtained or obtainable from plant genomic DNA comprising a gene having an open reading frame
(ORF) encoding a polypeptide which has at least between 70% and 99% amino acid sequence
identity to a polypeptide encoded by an Oryza, e.g., Oryza sativa, gene comprising a nucleotide
sequence as given in SEQ ID NOs: 1 to 461, 501-511, and 513-641.

37. A recombinant vector comprising a polynucleotide of any of claims 1 to 32 and 36.

38. An expression cassette comprising as operably linked components, a promoter, a
polynucleotide of any of claims 1-32 and 36 and a termination sequence.

39. A host cell comprising all or parts of a vector and/or an expression cassette of claims 37-38.

40. The host cell of claim 39 wherein said host cell is a bacterial cell, a yeast cell, an animal cell or
a plant cell.

41. The host cell of claim 40, wherein said plant cell is from a cereal plant.

42. A plant comprising a host cell of any of claims 39 - 41.

43. A plant according to claim 42, wherein said plant is selected from the group consisting of
maize, soybean, barley, alfalfa, sunflower, tomato, banana, canola, cotton, peanut, sorghum,
tobacco, sugarbeet, wheat, and rice.

44. A method of modulating carbohydrate composition of the plant grain, comprising functionally
integrating an isolated nucleic acid molecule according to any one of claims 1 to 14 comprising a
nucleic acid sequence encoding a polypeptide, which is involved in or associated with the synthesis,
metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-
regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.

45. A method of modulating the protein content and composition of the plant grain, comprising
functionally integrating an isolated nucleic acid molecule according to any one of claims 15 to 22
comprising a nucleic acid sequence encoding a polypeptide, which is involved in or associated with
the synthesis, metabolism or degradation of seed storage proteins in the plant grain and the
expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a
plant.

46. A method of modulating the fatty acid and/or lipid content and composition of the plant grain,
comprising functionally integrating an isolated nucleic acid molecule according to any one of claims 23
to 28 comprising a nucleic acid sequence encoding a polypeptide, which is involved in or associated
with fatty acid synthesis or lipid metabolism in the plant grain and the expression of which is up-
regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.

47. A method of modulating the grain filling process of the plant grain, comprising functionally
integrating an isolated nucleic acid molecule according to any one of claims 28 to 30 comprising a
nucleic acid sequence encoding a transcription factor polypeptide, which is involved in or associated
with the regulation and coordination of grain filling in plants and the expression of which is up-
regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.

48. A method of modulating the amino acid content and composition of the plant grain, comprising
functionally integrating an isolated nucleic acid molecule according to any one of claims 31 to 32
comprising a nucleic acid sequence encoding a polypeptide the activity of which is involved in or
associated with the metabolism of amino acids and the expression of which is up-regulated during
grain filling, into a cell, group of cells, tissue or organ of a plant.

49. A method of modulating nutrient content and composition of the plant grain, comprising:

a) functionally integrating

i) an isolated nucleic acid molecule according to any one of claims 1 to 14; 15-22; 23-28;
28-30 and 31 to 32, or a portion thereof in an anti-sense orientation; or

ii) a dsRNAi construct comprising an isolated nucleic acid molecule according to any one
of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 32, or a portion thereof in both a
sense and an anti-sense orientation, which, optionally, may be separated by a spacer
region;

under the transcriptional control of regulatory sequences required for expression in plants, into a
cell, group of cells, tissue or organ of a plant; and

b) expressing the constructs as provided in a) above in a cell, group of cells, a tissue or organ
of a plant to produce an RNA transcript.

50. A method of identifying or isolating polynucleotide sequences that are orthologous to a nucleic
acid molecule according to any one of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 32 comprising
a nucleic acid fragment encoding a polypeptide that is up-regulated during grain filling, from the
genome of another plant, wherein all or a portion of a particular nucleic acid sequence according to
any one of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 32 is used as a probe that selectively
hybridizes to gene sequences present in a population of cloned genomic DNA fragments or cDNA
fragments from a chosen source organism.

51. A method to identify a nucleic acid molecule encoding a polypeptide the expression of which is 
up-regulated during grain filling 
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a) contacting a plurality of isolated nucleic acid samples comprising all or a portion of a particular 
nucleic acid sequence according to anyone of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 32 on 
a solid substrate with a probe comprising plant nucleic acid corresponding to RNA isolated from a 
specific plant tissue during grain filling so as to form a complex, wherein each sample comprises a 

5 plurality of oligonucleotides corresponding to at least a portion of one plant gene; and 

b) contacting a second plurality of isolated nucleic acid samples comprising all or a portion of a
particular nucleic acid sequence according to any one of claims 1 to 14; 15-22; 23-28; 28-30 and
31 to 32 on a solid substrate with a second probe comprising plant nucleic acid corresponding to
RNA that is taken at a different development stage of the plant;

c) comparing complex formation in a) with complex formation in b)

so as to identify which samples correspond to genes that are expressed during grain filling.
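
The comparison in step c) of claim 51 can be illustrated with a minimal Python sketch. It assumes that complex formation is read out as one numeric signal per sample for each of the two probes. The function name, the log2-ratio threshold and the signal floor are hypothetical choices made for illustration; the claims do not specify how the comparison is scored.

```python
from math import log2


def genes_expressed_during_grain_filling(filling_signal, reference_signal,
                                         min_log2_ratio=1.0, floor=1.0):
    """Return sample IDs with a higher signal for the grain-filling probe.

    filling_signal / reference_signal map a sample ID to the hybridization
    signal (arbitrary units) measured in steps a) and b) respectively.
    min_log2_ratio and floor are illustrative assumptions, not claim values.
    """
    selected = []
    for sample_id, filling in filling_signal.items():
        reference = reference_signal.get(sample_id)
        if reference is None:
            continue  # sample was not measured in step b)
        # Floor both signals so the ratio and the logarithm stay defined.
        ratio = log2(max(filling, floor) / max(reference, floor))
        if ratio >= min_log2_ratio:
            selected.append(sample_id)
    return sorted(selected)


# Hypothetical signals for three samples on the solid substrate.
step_a = {"sample_1": 800.0, "sample_2": 120.0, "sample_3": 50.0}
step_b = {"sample_1": 100.0, "sample_2": 110.0, "sample_3": 200.0}
print(genes_expressed_during_grain_filling(step_a, step_b))
```

With these made-up values only `sample_1` exceeds the two-fold threshold, so it is the only sample reported.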

52. A method for detecting the presence of a polynucleotide according to any one of claims 1 to
14; 15-22; 23-28; 28-30 and 31 to 32, or a fragment or a variant thereof, or a complementary
sequence thereto in a sample, the method including the following steps of:

a) bringing into contact a nucleotide probe or a plurality of nucleotide probes which can hybridize 
with a polynucleotide according to any one of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 
32, or a fragment or a variant thereof, or a complementary sequence thereto and the sample to 
be assayed; and

b) detecting the hybrid complex formed between the probe and a nucleotide in the sample.
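
As a rough in-silico analogue of steps a) and b) of claim 52, the sketch below treats hybridization as perfect Watson-Crick pairing, so a probe "hybridizes" wherever its reverse complement occurs in the sample sequence. This is a deliberate simplification: real hybridization tolerates mismatches and depends on stringency conditions. The function names and the example sequences are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hybrid_sites(probe, sample):
    """Return 0-based positions in `sample` where `probe` pairs perfectly."""
    target = reverse_complement(probe)
    sample = sample.upper()
    positions = []
    start = sample.find(target)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping sites are also reported.
        start = sample.find(target, start + 1)
    return positions


# Hypothetical sample strand and a probe complementary to "GATGCAAG".
sample = "TTGACCGATGCAAGGCTTACGATGCAAG"
probe = reverse_complement("GATGCAAG")
print(probe, find_hybrid_sites(probe, sample))
```

An empty result corresponds to "no hybrid complex detected" under this exact-match assumption.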

53. A kit for detecting the presence of a polynucleotide according to any one of claims 1 to 14;
15-22; 23-28; 28-30 and 31 to 32, or a fragment or a variant thereof, or a complementary
sequence thereto in a sample, the kit including a nucleotide probe or a plurality of nucleotide probes
which can hybridize with a nucleotide sequence comprised within a polynucleotide according to any
one of claims 1 to 14; 15-22; 23-28; 28-30 and 31 to 32, or a fragment or a variant thereof, or a
complementary sequence thereto and, optionally, the reagents necessary for performing the
hybridization reaction.
54. A method of modifying the frequency of a grain filling gene in a plant population, comprising
the steps of: 

a) screening a plurality of plants using an oligonucleotide as a marker to determine the 
presence or absence of a grain filling gene in an individual plant, the oligonucleotide 
consisting of not more than 300 bases of a nucleotide sequence selected from the group 
consisting of SEQ ID NOs: 1 to SEQ ID NO: 461,

b) selecting at least one individual plant for breeding based on the presence or absence of 
the grain filling gene; and 

c) breeding at least one plant thus selected to produce a population of plants having a 
modified frequency of the grain filling gene. 
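
The screening and selection steps a) and b) of claim 54 can be sketched in Python as follows. The sketch assumes that screening yields a simple present/absent call per plant; the plant names and function names are hypothetical. It only compares marker frequency in the screened population with that in the selected parents, and does not model inheritance in the progeny produced in step c).

```python
def marker_frequency(calls):
    """Fraction of plants in which the marker was detected."""
    if not calls:
        return 0.0
    return sum(1 for present in calls.values() if present) / len(calls)


def select_for_breeding(calls, want_present=True):
    """Return the plants whose marker call matches the breeding goal."""
    return sorted(plant for plant, present in calls.items()
                  if present == want_present)


# Hypothetical screening result: True means the oligonucleotide marker
# detected the grain filling gene in that plant.
screened = {"plant_A": True, "plant_B": False, "plant_C": True,
            "plant_D": False, "plant_E": False}

parents = select_for_breeding(screened, want_present=True)
parent_calls = {plant: screened[plant] for plant in parents}

print(marker_frequency(screened))      # frequency before selection
print(parents)                         # plants chosen for breeding
print(marker_frequency(parent_calls))  # frequency among selected parents
```

Passing `want_present=False` selects against the gene instead, which matches the "presence or absence" wording of step b).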

55. A method according to claim 54, wherein the oligonucleotide comprises a simple sequence 
repeat (SSR) sequence comprising at least two consecutive repeat units of an SSR, the start and end 
points of which are provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic
acids immediately adjacent to said at least two consecutive repeat units. 
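
To make the structure of such a marker concrete, the following Python sketch locates a simple sequence repeat and the 14 bases immediately adjacent on each side. In the application the SSR start and end points are taken from Tables 2 and 3; here they are instead found with a regular expression, which is an assumption made purely for illustration. The function name, the default repeat-unit length and the example sequence are hypothetical.

```python
import re


def find_ssr_markers(sequence, unit_len=2, min_repeats=2, flank_len=14):
    """Find SSRs of a given unit length together with their flanks.

    Returns a list of dicts with 0-based start, exclusive end, the repeat
    unit, the repeat count and up to `flank_len` bases on either side.
    """
    sequence = sequence.upper()
    # A unit followed by at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    markers = []
    for match in pattern.finditer(sequence):
        start, end = match.span()
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAA
        markers.append({
            "start": start,
            "end": end,
            "unit": unit,
            "repeats": (end - start) // unit_len,
            "left_flank": sequence[max(0, start - flank_len):start],
            "right_flank": sequence[end:end + flank_len],
        })
    return markers


# Hypothetical sequence: 14-base flank, (CA)5 repeat, 14-base flank.
example = "GATTCAGGCTAACG" + "CA" * 5 + "TGGATCCGTAAGCT"
for marker in find_ssr_markers(example, unit_len=2, min_repeats=3):
    print(marker)
```

The example call uses `min_repeats=3` rather than the claimed minimum of two only to avoid reporting chance duplications in the short demonstration sequence.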

56. A method of plant breeding to select for or against a trait of interest which is associated with 
grain filling in plants, comprising the steps of: 

a. identifying the trait of interest; identifying at least one oligonucleotide that can be used as 
a marker for the trait, the oligonucleotide consisting of not more than 300 bases of a 
nucleotide sequence selected from the group consisting of SEQ ID NOs: 1 to SEQ ID 
NO: 461, 

b. screening at least one plant for the presence of the at least one oligonucleotide; 

c. selecting at least one plant based on presence or absence of the at least one 
oligonucleotide; 

d. breeding at least one plant thus selected to produce a population of plants having a 
modified frequency of the at least one oligonucleotide; and 
e. screening at least one plant of the population for the presence or absence of the grain
filling trait.

57. A method according to claim 56, wherein the oligonucleotide comprises a simple sequence 
repeat (SSR) sequence comprising at least two consecutive repeat units of an SSR, the start and end 
points of which are provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic
acids immediately adjacent to said at least two consecutive repeat units. 

58. A method of determining a varietal identity of a plant, comprising: 

a) obtaining a nucleic acid sample from a plant; 

b) identifying at least one oligonucleotide to obtain an oligonucleotide profile for the plant, 
wherein the oligonucleotide consists of not more than 300 bases of a nucleotide 
sequence selected from the group consisting of SEQ ID NOs: 1 to SEQ ID NO: 461, 
the oligonucleotide comprising a simple sequence repeat (SSR) sequence comprising at 
least two consecutive repeat units of an SSR, the start and end points of which are 
provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic acids
immediately adjacent to said at least two consecutive repeat units in the sample; and 

c) comparing the SSR profile to at least one known SSR profile corresponding to at least 
one known variety to determine the varietal identity of the plant. 
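
Step c) of the varietal identity method can be sketched as a nearest-profile lookup. The sketch assumes an SSR profile is represented as a mapping from a locus name to the number of repeat units observed; this representation, the mismatch-count distance and all names and values below are hypothetical and are not taken from the application.

```python
def profile_distance(profile, reference):
    """Count loci whose repeat counts differ; missing loci count as differences."""
    loci = set(profile) | set(reference)
    return sum(1 for locus in loci
               if profile.get(locus) != reference.get(locus))


def identify_variety(profile, known_profiles):
    """Return (variety, distance) for the closest known SSR profile."""
    scored = ((name, profile_distance(profile, reference))
              for name, reference in known_profiles.items())
    # Break ties alphabetically so the result is deterministic.
    return min(scored, key=lambda item: (item[1], item[0]))


# Hypothetical reference profiles: locus name -> repeat-unit count.
known = {
    "variety_X": {"ssr_1": 5, "ssr_2": 8, "ssr_3": 12},
    "variety_Y": {"ssr_1": 7, "ssr_2": 8, "ssr_3": 10},
}
sample_profile = {"ssr_1": 7, "ssr_2": 8, "ssr_3": 10}
print(identify_variety(sample_profile, known))
```

A distance of zero indicates an exact match at every locus; a non-zero best distance would need a cut-off, which is left to the user.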

58. An oligonucleotide primer consisting of between 8 and 150 bases which comprises at least 14 
bases selected from the group of flanking sequences obtainable from a nucleotide sequence 
provided in SEQ ID NOs: 3435 to SEQ ID NO: 150133, which at least 14 bases are 
immediately adjacent to at least two consecutive repeat units of an SSR, the start and end 
points of which are provided in Tables 2 and 3. 

59. A computer-readable medium having stored thereon a data structure comprising: 
a) sequence information of a polynucleotide according to any one of claims 1 to 14; 15-22;
23-28; 28-30 and 31 to 32 and/or ; and a polynucleotide according to any one of
claims ... to ....

b) a module receiving the nucleic acid molecule which compares the nucleic acid sequence
of the molecule to at least one other nucleic acid sequence.
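
One possible reading of the data structure and comparison module of claim 59 is sketched below in Python. The record type, the ungapped identity score and the ranking function are all hypothetical; the claim does not say how sequences are stored or compared, and a practical module would more likely use a gapped alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    """Stored sequence information for one polynucleotide."""
    seq_id: str
    sequence: str


def ungapped_identity(a, b):
    """Fraction of matching positions over the shorter sequence, without gaps."""
    length = min(len(a), len(b))
    if length == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return matches / length


def rank_matches(query, records):
    """Compare a received sequence to every stored record, best match first."""
    scored = [(rec.seq_id, ungapped_identity(query, rec.sequence))
              for rec in records]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stored records and query sequence.
stored = [SequenceRecord("rec_1", "ACGTACGTAC"),
          SequenceRecord("rec_2", "ACGTTCGAAA")]
print(rank_matches("ACGTACGTAA", stored))
```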
an extent that no meaningful International Search can be carried out, specifically:

see FURTHER INFORMATION sheet PCT/ISA/210 



3. because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)



This International Searching Authority found multiple inventions in this international application, as follows: 

see additional sheet 

1. As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

see FURTHER INFORMATION sheet, invention 1. 



Remark on Protest 




The additional search fees were accompanied by the applicant's protest. 



No protest accompanied the payment of additional search fees.
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



Continuation of Box I.2

Claims Nos.: Claim renumbered as claim 59 concerning oligonucleotide primer

Claim 58 (cf. p. 258, lines 9-19) is followed by a second claim with the
same number (cf. p. 258, lines 21-25). The claims are not numbered
consecutively. For the purpose of defining the inventions, the claims
are renumbered accordingly: claim 59 follows the first-mentioned claim
58.



The claim renumbered as claim 59 concerns oligonucleotide primers. The
primer is defined as "consisting of between 8 and 150 bases which
comprise at least 14 bases". Since it is not possible for an
oligonucleotide of 8 bases to comprise 14 bases, it is unclear to what
the claim refers. In addition, it is not evident from the claim whether the
sequences provided in SEQ ID NOS: 3435 to SEQ ID NO: 150133 are the
flanking sequences, since the claim refers to "flanking sequences
obtainable from" said sequences. In any case, there is no further
characterisation of these SEQ ID NOS. either in the description or in the
sequence listing. The claim also attempts to relate these flanking
sequences to the SSRs of Tables 2 and 3. However, since the skilled
person is left in doubt as to the actual features or constitution of the
claimed sequences, this does not clarify the claim. Hence claim 59 does
not meet the requirements of Articles 5 and 6 PCT. The definition of the
claimed oligonucleotide so lacks clarity that the examining division is
unable to associate the claimed subject matter with any of the
inventions mentioned in the communication pursuant to Art. 17(3)(a) PCT,
and no search has been carried out for the subject matter of this claim.

The applicant's attention is drawn to the fact that claims relating to 
inventions in respect of which no international search report has been 
established need not be the subject of an international preliminary 
examination (Rule 66.1(e) PCT). The applicant is advised that the EPO 
policy when acting as an International Preliminary Examining Authority is 
normally not to carry out a preliminary examination on matter which has 
not been searched. This is the case irrespective of whether or not the 
claims are amended following receipt of the search report or during any 
Chapter II procedure. If the application proceeds into the regional phase 
before the EPO, the applicant is reminded that a search may be carried 
out during examination before the EPO (see EPO Guideline C-VI, 8.5), 
should the problems which led to the Article 17(2) declaration be 
overcome. 
This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 



Invention 1: Claims 1-4, 7, 8, 33-44, 49-58, 60, all partially.

Polynucleotide and polypeptide as defined in the claims by 
reference to SEQ ID NOS: 69 and 70, methods based on said 
polynucleotide. 



Inventions 2-302: Claims 1-58 and 60 all partially in so far as 
they relate to the subject matter as follows. 

Polynucleotide and polypeptide sequences as defined in the 
claims by reference to SEQ ID NOS: 1-68, 71-462, and 501-642,
methods based on said sequence. Each odd numbered 
polynucleotide paired with the following even numbered 
polypeptide representing an individual invention. 
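
The pairing rule can be checked with a short Python sketch. It assumes that each odd-numbered SEQ ID NO n is paired with n + 1 and that inventions are numbered consecutively from 2 across the three listed ranges; the function name is hypothetical. Under that assumption the ranges yield 301 pairs, which agrees with the stated numbering of Inventions 2-302.

```python
def paired_inventions(ranges, first_invention=2):
    """Map invention numbers to (polynucleotide, polypeptide) SEQ ID pairs.

    Each range is an inclusive (start, end) span that begins on an odd
    SEQ ID NO and ends on an even one.
    """
    inventions = {}
    number = first_invention
    for start, end in ranges:
        # Step through the odd-numbered polynucleotide IDs in the range.
        for polynucleotide_id in range(start, end, 2):
            inventions[number] = (polynucleotide_id, polynucleotide_id + 1)
            number += 1
    return inventions


# SEQ ID ranges listed for Inventions 2-302 (69 and 70 belong to Invention 1).
seq_ranges = [(1, 68), (71, 462), (501, 642)]
inventions = paired_inventions(seq_ranges)

print(len(inventions))                    # number of paired inventions
print(min(inventions), max(inventions))   # first and last invention number
print(inventions[2], inventions[302])     # first and last SEQ ID pair
```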